Mass spectrometry-cleavable cross-linker

ABSTRACT

Provided herein is the synthesis of a novel acidic residue-targeting, sulfoxide-containing, MS-cleavable homobifunctional cross-linker. The novel mass spectrometry-cleavable cross-linking agents can be used in mass spectrometry to facilitate structural analysis of intra-protein interactions in proteins and inter-protein interactions in protein complexes. Also disclosed herein are data based on the novel MS-cleavable homobifunctional cross-linker that are complementary to data obtained with amine-reactive sulfoxide-containing MS-cleavable reagents.

PRIORITY AND CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 62/345,670, filed on Jun. 3, 2016, which is hereby incorporated by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with Government support under Grant Nos. R01GM074830, R01GM074830-1052, and R01GM106003 awarded by the National Institutes of Health. The Government has certain rights in this invention.

SEQUENCE LISTING IN ELECTRONIC FORMAT

The present application is being filed along with a Sequence Listing as an ASCII text file via EFS-Web. The Sequence Listing is provided as a file entitled UCI012003ASEQLIST.txt, created and last saved on Jun. 2, 2017, which is 58,182 bytes in size. The information in the electronic format of the Sequence Listing is incorporated herein by reference in its entirety in accordance with 35 U.S.C. § 1.52(e).

BACKGROUND

Field

The present disclosure relates generally to mass spectrometry-cleavable cross-linkers and methods of using mass spectrometry-cleavable cross-linkers for probing proteins and protein complexes.

Description of the Related Art

The majority of proteins exert their functions in the form of protein complexes. These macromolecular assemblies and their protein-protein interactions play critical roles in regulating integral biological processes. As a result, perturbations of endogenous protein-protein interactions can result in deleterious effects on cellular activities.

Structural analyses of these complexes by traditional biophysical structural techniques such as x-ray crystallography and nuclear magnetic resonance (NMR) are frequently utilized to elucidate their topologies. Unfortunately, many large and heterogeneous complexes are refractory to such methods, ushering in the development of new hybrid structural strategies.

SUMMARY

In some embodiments, an MS-cleavable cross-linker is provided. In some embodiments, the MS-cleavable cross-linker comprises at least one hydrazide reactive group, at least one sulfoxide group, and at least one collision-induced dissociation (CID) cleavable bond, wherein the MS-cleavable cross-linker is configured for mapping intra-protein interactions in a protein, or inter-protein interactions in a protein complex, or combinations thereof.

In some embodiments of the MS-cleavable cross-linker, the at least one hydrazide reactive group is configured to react with an activated acidic side chain in the protein or protein complex. In some embodiments of the MS-cleavable cross-linker, the at least one CID cleavable bond is a C—S bond adjacent to the at least one sulfoxide group. In some embodiments of the MS-cleavable cross-linker, the MS-cleavable cross-linker is dihydrazide sulfoxide (DHSO), having the structure:

In some embodiments of the MS-cleavable cross-linker, the two hydrazide reactive groups are separated by a 12.5 Å long spacer arm comprising the two symmetrical CID cleavable C—S bonds flanking the one sulfoxide group.

In some embodiments, a method for synthesis of an MS-cleavable cross-linker is provided. In some embodiments, the method for synthesis of an MS-cleavable cross-linker comprises the steps of (i) providing disuccinimidyl sulfoxide (DSSO) in dichloromethane, (ii) adding tert-butyl carbazate to derive a first solution, (iii) stirring the first solution of step (ii), (iv) adding trifluoroacetic acid to derive a second solution, (v) stirring the second solution of step (iv), (vi) removing solvent from the second solution of step (v) in vacuo to derive an oil, (vii) dissolving the oil from step (vi) in methanol and adding trimethylamine to obtain a mixture, (viii) stirring the mixture from step (vii) to obtain a precipitate, (ix) collecting the precipitate from step (viii), washing the precipitate in fresh methanol, and drying the precipitate in vacuo, thereby obtaining the MS-cleavable cross-linker.

In some embodiments of the method for synthesis, stirring the first solution of step (ii) is performed at room temperature. In some embodiments of the method for synthesis, stirring the first solution of step (ii) is performed for about 6 h to about 24 h. In some embodiments of the method for synthesis, stirring the second solution of step (iv) is performed at room temperature. In some embodiments of the method for synthesis, stirring the second solution of step (iv) is performed for about 36 h to about 144 h. In some embodiments of the method for synthesis, stirring the mixture is performed at room temperature. In some embodiments of the method for synthesis, stirring the mixture is performed for about 10 min to about 40 min. In some embodiments of the method for synthesis, the collecting of the precipitate from step (viii) and the washing of the precipitate in fresh methanol are performed about 1 to about 10 times. In some embodiments of the method for synthesis, the MS-cleavable cross-linker is DHSO, having the structure:

In some embodiments, a method for mapping intra-protein interactions in a protein, inter-protein interactions in a protein complex, or combinations thereof is provided. In some embodiments, the method for mapping intra-protein interactions comprises providing 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM) as an activating agent, activating at least one acidic residue by DMTMM in the protein or protein complex, providing an MS-cleavable cross-linker, wherein the MS-cleavable cross-linker comprises at least one hydrazide reactive group, at least one sulfoxide group, and at least one CID cleavable bond, cross-linking the MS-cleavable cross-linker to the at least one activated acidic residue in the protein or protein complex, digesting with an enzyme the protein or protein complex cross-linked to the MS-cleavable cross-linker, generating one or more peptide fragments of the protein or protein complex, wherein the one or more peptide fragments are chemically cross-linked to the MS-cleavable cross-linker, and identifying the one or more peptide fragments using tandem mass spectrometry (MS^(n)), thereby mapping intra-protein interactions in the protein and/or inter-protein interactions in the protein complex.

In some embodiments of the method for mapping intra-protein interactions, the MS-cleavable cross-linker is DHSO, having the structure

In some embodiments, a method for cross-linking mass spectrometry (XL-MS) for identifying one or more cross-linked peptides is provided. In some embodiments, the method for XL-MS comprises providing 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM) as an activating agent, activating at least one acidic residue by DMTMM in a protein or a protein complex, providing an MS-cleavable cross-linker, wherein the MS-cleavable cross-linker comprises at least one hydrazide reactive group, at least one sulfoxide group, and at least one CID cleavable bond, cross-linking the MS-cleavable cross-linker to the at least one activated acidic residue in the protein or protein complex, digesting with an enzyme the protein or protein complex cross-linked to the MS-cleavable cross-linker, generating one or more peptide fragments of the protein or protein complex, wherein the one or more peptide fragments are chemically cross-linked to the MS-cleavable cross-linker, performing a liquid chromatography-tandem mass spectrometry (LC-MS^(n)) analysis on the one or more cross-linked peptides, wherein the LC-MS^(n) analysis comprises detecting the one or more cross-linked peptides by MS¹ analysis, selecting the one or more cross-linked peptides detected by MS¹ for MS² analysis, selectively fragmenting the at least one CID cleavable bond and separating the one or more cross-linked peptides during MS² analysis, sequencing the one or more cross-linked peptides separated during MS² analysis by MS³ analysis, and integrating data obtained during MS¹, MS² and MS³ analyses to identify the one or more cross-linked peptides.

In some embodiments of the method for XL-MS, the MS-cleavable cross-linker is DHSO, having the structure

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A shows structure of sulfoxide containing MS-cleavable cross-linker DSSO¹⁶.

FIG. 1B shows structure of sulfoxide containing MS-cleavable cross-linker DMDSSO¹⁷.

FIG. 1C shows structure of sulfoxide containing MS-cleavable cross-linker Azide-A-DSBSO¹⁸.

FIG. 1D shows an embodiment of a synthesis scheme of MS-cleavable cross-linker DHSO.

FIG. 2A-FIG. 2E show characteristic MS² fragmentation patterns for DHSO cross-linked peptides.

FIG. 2A shows a scheme of peptide cross-linking by DHSO in the presence of DMTMM.

FIG. 2B shows MS² fragmentation of DHSO inter-linked heterodimer α-β.

FIG. 2C shows MS² fragmentation of DHSO intra-linked peptide α_(intra).

FIG. 2D shows MS² fragmentation of dead-end modified peptide α_(DN).

FIG. 2E shows an embodiment of a conversion scheme of α_(S) to α_(T).

FIG. 3A-FIG. 3D show MS^(n) analysis of DHSO inter-linked Ac-SR8 peptide.

FIG. 3A shows MS² spectra of DHSO interlinked Ac-SR8 at charge state [α-α]⁴⁺ (m/z 548.7623⁴⁺).

FIG. 3B shows MS² spectra of DHSO interlinked Ac-SR8 at charge state [α-α]⁵⁺ (m/z 439.2117⁵⁺).

FIG. 3C shows MS³ spectra of MS² fragment ion α_(A) (m/z 536.27²⁺).

FIG. 3D shows MS³ spectra of MS² fragment ion β_(T) (m/z 552.26²⁺), which were detected in FIG. 3A.

FIG. 4A-FIG. 4D show MS^(n) analysis of a representative DHSO inter-linked Myoglobin peptide.

FIG. 4A shows MS spectrum of a DHSO inter-linked myoglobin peptide α-β (m/z 517.2703⁵⁺).

FIG. 4B shows MS² spectrum of the cross-linked peptide detected in FIG. 4A.

FIG. 4C shows MS³ spectra of MS² fragment ion α_(A) (m/z 429.74²⁺).

FIG. 4D shows MS³ spectra of MS² fragment ion β_(T) (m/z 569.63³⁺).

FIG. 5A-FIG. 5F show myoglobin cross-link maps.

FIG. 5A shows myoglobin linear sequence showing locations of the 8 α-helices (α1-α8) and 3₁₀ helix.

FIG. 5B shows DHSO cross-link map on myoglobin linear sequence.

FIG. 5C shows DSSO cross-link map on myoglobin linear sequence.

FIG. 5D shows DHSO cross-link map on myoglobin crystal structure (PDB: 1DWR).

FIG. 5E shows DSSO cross-link map on myoglobin crystal structure (PDB: 1DWR).

FIG. 5F shows a graph of the distribution plot of identified linkages versus their spatial distances between D|E-D|E for DHSO (black bars) or K-K for DSSO (white bars) in myoglobin structure.

FIG. 6A-FIG. 6D show BSA cross-link maps on its linear sequence.

FIG. 6A shows DHSO cross-link map.

FIG. 6B shows ADH cross-link map.

FIG. 6C shows PDH cross-link map.

FIG. 6D shows DSSO cross-link map.

FIG. 7A-FIG. 7D show characteristic MS² fragmentation patterns for DHSO cross-linked peptides.

FIG. 7A shows the scheme of peptide cross-linking by DHSO in the presence of DMTMM.

FIG. 7B shows MS² fragmentation of DHSO intra-linked peptide α_(intra).

FIG. 7C shows dead-end modified peptide α_(DN).

FIG. 7D shows the conversion scheme of α_(S) to α_(T).

FIG. 8 shows an embodiment of the general XL-MS workflow for the identification of cross-linked DHSO peptides from proteins.

FIG. 9A-FIG. 9D show MS^(n) analysis of a representative DHSO inter-linked BSA peptide.

FIG. 9A shows MS spectrum of a DHSO interlinked BSA peptide α-β (m/z 692.8475⁴⁺).

FIG. 9B shows MS² spectrum of the cross-linked peptide detected in FIG. 9A.

FIG. 9C shows MS³ spectra of MS² fragment ions α_(A) (m/z 616.34²⁺).

FIG. 9D shows MS³ spectra of MS² fragment ions β_(T) (m/z 760.36²⁺).

FIG. 10A-FIG. 10C show MS^(n) analysis of a DHSO intra-linked myoglobin peptide.

FIG. 10A shows MS spectrum of a DHSO intra-linked myoglobin peptide α_(intra) (m/z 518.5275⁴⁺).

FIG. 10B shows MS² spectrum of the intra-linked peptide in FIG. 10A.

FIG. 10C shows MS³ spectrum of the MS² fragment ion α_(A+T) (m/z 514.02⁴⁺).

FIG. 11A-FIG. 11D show MS^(n) analysis of a DHSO dead-end modified myoglobin peptide.

FIG. 11A shows MS spectrum of a DHSO dead-end modified myoglobin peptide α_(DN) (m/z 604.3095³⁺).

FIG. 11B shows MS² spectrum of the parent ion detected in FIG. 11A.

FIG. 11C shows MS³ spectra of MS² fragment ions α_(A) (m/z 559.30³⁺).

FIG. 11D shows MS³ spectra of MS² fragment ions α_(T) (m/z 569.96³⁺).

FIG. 12A shows BSA cross-link maps on its crystal structure (PDB: 4F5S) with DHSO crosslink map (dotted lines).

FIG. 12B shows BSA cross-link maps on its crystal structure (PDB: 4F5S) with DSSO crosslink map (dotted lines).

FIG. 12C shows the distribution plot of identified linkages versus their spatial distances between D|E-D|E for DHSO (gray) or K-K for DSSO (black) in BSA structure.

DETAILED DESCRIPTION

Introduction

The majority of proteins exert their functions in the form of protein complexes. These macromolecular assemblies and their protein-protein interactions play critical roles in regulating integral biological processes. As a result, perturbations of endogenous protein-protein interactions can result in deleterious effects on cellular activities. Structural analyses of these complexes by traditional biophysical structural techniques such as x-ray crystallography and nuclear magnetic resonance (NMR) are frequently utilized to elucidate their topologies. Unfortunately, many large and heterogeneous complexes are refractory to such methods, ushering in the development of new hybrid structural strategies.

Chemical cross-linking coupled with mass spectrometry (XL-MS) has emerged as a powerful and popular approach for delineating the protein interactions within large multi-subunit protein complexes^(1,2). Moreover, cross-linking can capture temporal protein interactions by forming covalent bonds between proximal amino acid residues, effectively freezing transient interactions and providing information on the identities and spatial orientations of interacting proteins simultaneously. These linkages are then utilized as distance constraints to facilitate three-dimensional modeling of protein complexes by refining existing high-resolution protein structures or complementing lower resolution biophysical structural techniques (e.g. cryo-electron microscopy) in order to position individual protein subunits or interacting regions³⁻⁹.

Elucidating the structure of protein complexes is paramount for understanding their functions. Recent technological advancements in cross-linking mass spectrometry (XL-MS) have made it a powerful methodology for defining protein-protein interactions and elucidating architectures of large protein complexes. However, one of the major inherent challenges in conventional XL-MS studies is the unambiguous identification of cross-linked peptides, owing to the difficulty of interpreting convoluted tandem mass spectra resulting from the fragmentation of covalently linked peptides.

To this end, various types of cleavable cross-linkers have been developed to facilitate and simplify MS identification of cross-linked peptides, such as UV-photocleavable¹⁰, chemical-cleavable¹¹, and MS-cleavable reagents¹²⁻¹⁸. Of these, MS-cleavable cross-linkers appear to be the most attractive option due to their capability to improve the identification of cross-linked peptides by separating inter-linked peptides during MS analysis.

To facilitate this process, the inventors have previously developed a series of amine-reactive sulfoxide-containing MS-cleavable cross-linkers. These MS-cleavable reagents have allowed the inventors to establish a common robust XL-MS workflow (for example, FIG. 8) that enables fast and accurate identification of cross-linked peptides using multistage tandem mass spectrometry (MS^(n)). Although amine reactive reagents targeting lysine residues are successful, it remains difficult to characterize protein interaction interfaces with little or no lysine residues. In recent years, the inventors have developed a new class of MS-cleavable cross-linking reagents containing sulfoxide group(s) within their spacer regions, i.e., disuccinimidyl sulfoxide (DSSO)¹⁶, dimethyl disuccinimidyl sulfoxide (DMDSSO)¹⁷, Azide-tagged Acid-cleavable DiSuccinimidyl BisSulfoxide (Azide-A-DSBSO)¹⁸ (FIG. 1A-FIG. 1C).

These MS-cleavable reagents contain symmetric MS-labile C—S bonds (adjacent to the sulfoxide group) that can be selectively and preferentially fragmented prior to peptide backbone cleavage during collision induced dissociation (CID)¹⁶⁻¹⁸. Such fragmentation is robust and predictable, occurring independently of cross-linking types, peptide charges, and sequences. Ultimately this unique feature enables simplified and unambiguous identification of cross-linked peptides by MS^(n) analysis and conventional database searching tools¹⁶⁻¹⁸. In some embodiments, the inventors' sulfoxide-containing, MS-cleavable cross-linkers have been successfully applied to not only define protein-protein interactions and elucidate structures of protein complexes in vitro^(5,9,16,27) and in vivo¹⁸, but also quantify structural dynamics of protein complexes^(17,19).

In current XL-MS studies, amine-reactive reagents targeting lysine residues are the most widely used compounds for successful elucidation of protein structures. This is due to their effective and specific cross-linking chemistry, as well as frequent occurrence of lysine residues in protein sequences and at surface-exposed regions of protein structures. However, it remains challenging to characterize protein interaction interfaces with little to no lysine residues.

Therefore, there is a need for the development of additional cross-linking chemistries in order to increase the coverage of structural information obtainable from XL-MS experiments, particularly in systems where protein interacting regions are refractory to amine-specific cross-linking. Although several types of cross-linkers targeting other amino acids (e.g. sulfhydryl-reactive and non-specific photo-reactive reagents) are commercially available, their applications in studying protein-protein interactions thus far are very limited.

For instance, sulfhydryl-reactive cross-linking reagents carrying functional groups with specific chemistries targeting cysteine residues have not been widely adopted, most likely owing to the relatively low occurrence of cysteine residues and their participation in forming disulfide bonds in protein structures. In comparison, although photochemical cross-linking reagents can improve the coverage of protein interaction contacts by reacting with any amino acids non-specifically²⁰, the resulting cross-linked products are often unpredictable, thus making their unambiguous MS identification even more difficult. In addition, such non-specific cross-linking has a higher chance of introducing non-specific interactions.

Therefore, a specific cross-linking chemistry targeting amino acid residues abundant at protein interaction sites would be ideal for complementing lysine targeting cross-linkers. While hydrophobic amino acid residues often constitute the cores, charged hydrophilic residues such as lysine, arginine, aspartic acid (Asp), and glutamic acid (Glu) often occupy surface-exposed regions of protein complexes, making them ideal targets for mapping protein interactions.

A recent study by Leitner et al. has demonstrated the feasibility of an acidic residue-specific cross-linking chemistry to study protein interactions using non-cleavable homobifunctional dihydrazide cross-linkers in conjunction with the coupling reagent 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM)²². This methodology is an improvement on the acidic residue cross-linking chemistry involving the coupling reagent 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide hydrochloride (EDC) that requires the cross-linking reaction to occur at a pH of 5.5²³.

In comparison, DMTMM coupling with dihydrazide cross-linkers is compatible with proteins at neutral pH (7.0-7.5) and therefore better suited for studying the structures of proteins and protein complexes under physiological conditions. However, this cross-linking strategy remains susceptible to the challenges associated with traditional cross-linking reagents in unambiguously identifying cross-linked peptides and their linkage sites. Due to the increased prevalence of Asp and Glu in protein sequences, the accurate and unambiguous identification of peptides containing non-cleavable dihydrazide cross-linked acidic residues would be intrinsically more complicated than the identification of lysine cross-linked peptides.

Due to the lack of proper reagents targeting other residues, it remains challenging to characterize protein interaction interfaces with little or no lysine residues. Similar to lysine residues, acidic residues are abundant and often present at protein contact regions.

To expand the coverage of interaction regions by XL-MS studies, in some embodiments herein, a new acidic residue reactive and sulfoxide containing MS-cleavable cross-linker, 3,3′-sulfinyldi(propanehydrazide) (DHSO) is provided. The development of DHSO cross-linker not only enlarges the range of the application of XL-MS approaches, but also further demonstrates the robustness and applicability of sulfoxide-based MS cleavability in conjunction with various reactive chemistries. In comparison to existing non-cleavable dihydrazides, DHSO contains robust MS-cleavable bonds that enable simplified and unambiguous identification of DHSO cross-linked peptides.

To simplify MS analysis and facilitate the identification of acidic residue cross-linked peptides, a sulfoxide-containing, MS-cleavable, acidic residue-specific homobifunctional cross-linking reagent, dihydrazide sulfoxide (DHSO, a.k.a. 3,3′-sulfinyldi(propanehydrazide)) has been developed as disclosed herein. This reagent adopts the same MS-labile sulfoxide chemistry as the inventors' previously developed amine-reactive, MS-cleavable cross-linkers (i.e. DSSO, DMDSSO and Azide-A-DSBSO), thus enabling robust and unambiguous identification of cross-linked peptides via the same XL-MS^(n) workflow¹⁶⁻¹⁸. DHSO represents a novel class and the first generation of acidic residue-targeting cross-linking reagents with MS-cleavability.

According to a recent SwissProt database release²¹, aspartic and glutamic acids comprise roughly 12.2% of all amino acid residues, compared to 5.8% for lysine. Therefore, acidic residues (i.e. aspartic and glutamic acids) represent high-potential targets for cross-linking studies due to their abundance and prevalence at interaction interfaces. In some embodiments, the proteins and/or protein complexes targeted by DHSO comprise about 1% to about 12.5% acidic amino acid residues. In some embodiments, the proteins and/or protein complexes targeted by DHSO comprise about 0.1, 0.2, 0.3, 0.4, 0.5, 1, 2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20, 22.5 or 25% acidic amino acid residues, or a value within a range defined by any two of the aforementioned values.
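As a rough illustration of how the acidic residue content of a given protein might be compared with the database-wide figures above, the following minimal sketch counts Asp/Glu and Lys residues in a one-letter amino acid sequence. The function name and the example sequence are hypothetical and are not taken from the disclosure.

```python
from collections import Counter

ACIDIC = {"D", "E"}  # one-letter codes for aspartic acid and glutamic acid


def residue_fractions(sequence: str) -> dict:
    """Return the fractions of acidic (D/E) and lysine (K) residues in a sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    acidic = sum(counts[r] for r in ACIDIC)
    return {"acidic": acidic / len(seq), "lysine": counts["K"] / len(seq)}


# Hypothetical 20-residue peptide, used only to illustrate the calculation.
example = "MDEKLAEDGKVTEDLASRPE"
fractions = residue_fractions(example)
print(f"acidic: {fractions['acidic']:.1%}, lysine: {fractions['lysine']:.1%}")
```

A sequence whose acidic fraction is well above its lysine fraction would, under the reasoning above, be a candidate for acidic residue-targeted cross-linking.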

To expand the coverage of protein interaction regions, disclosed herein is the development of a new acidic residue-targeting, sulfoxide-containing, MS-cleavable homobifunctional cross-linker, dihydrazide sulfoxide (DHSO). In some embodiments, a novel acidic residue-reactive and sulfoxide-containing MS-cleavable homobifunctional cross-linker for probing protein-protein interactions is provided²⁸.

In some embodiments, DHSO cross-linked peptides display the same predictable and characteristic fragmentation pattern during collision induced dissociation as those amine-reactive sulfoxide-containing MS-cleavable cross-linked peptides, thus permitting their simplified analysis and unambiguous identification by MS^(n). In some embodiments, proteins are cross-linked by DHSO (3,3′-sulfinyldi(propanehydrazide)) in the presence of coupling reagent 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM), forming covalent linkages between proximal and interacting proteins through aspartic acid and/or glutamic acid linkages. The resulting cross-linked peptides were analyzed by multistage tandem mass spectrometry (MS^(n)). In some embodiments, other acidic amino acids besides Asp and Glu are included within the scope of this disclosure.

Additionally, in some embodiments, DHSO can provide data complementary to those obtained with amine-reactive reagents. The present disclosure not only expands the range of application of XL-MS approaches, but also further demonstrates the robustness and applicability of sulfoxide-based MS-cleavability in conjunction with various reactive chemistries.

In some embodiments, an integrated XL-MS platform to determine composition-dependent conformational changes of proteins and/or protein complexes is provided. In some embodiments, it is expected that DHSO-based XL-MS strategies will become an invaluable tool in providing a complementary subset of cross-linking data towards a comprehensive structural elucidation of protein complexes by XL-MS.

Design and Synthesis of a Novel Acidic Residue-Targeting Sulfoxide-Containing MS-Cleavable Cross-Linker

To facilitate facile and accurate identification of acidic residue cross-linked peptides, the aim was to develop a novel MS-cleavable cross-linking reagent specific to Asp and Glu residues. In some embodiments, this required the incorporation of a functional group with robust MS-inducible cleavage sites located in the spacer region of the cross-linker. Previously, the inventors successfully developed a novel class of amine-reactive, sulfoxide-containing MS-cleavable cross-linkers, i.e., DSSO¹⁶, DMDSSO¹⁷ and Azide-A-DSBSO¹⁸ (FIG. 1A-FIG. 1C). The C—S bonds adjacent to the sulfoxide group(s) in these reagents have proven to be reliable labile bonds that fragment selectively and in preference to the peptide backbone during collision-induced dissociation. Additionally, such fragmentation is predictable and occurs independently of peptide charge and sequence. These unique features facilitate the simplified analysis of sulfoxide-containing cross-linked peptides and their unambiguous identification by MS^(n)¹⁶⁻¹⁸. Following the success of the inventors' MS-cleavable, amine-reactive cross-linkers, a novel acidic residue-reactive, MS-cleavable homobifunctional dihydrazide cross-linker incorporating a sulfoxide group in the spacer region, i.e., 3,3′-sulfinyldi(propanehydrazide) (a.k.a. dihydrazide sulfoxide (DHSO)), was designed. DHSO is synthesized from DSSO with two additional synthesis steps (FIG. 1D).

It will be understood by one of ordinary skill in the art that the parameters and conditions provided herein are exemplary and are not limiting in any way. The parameters and conditions of synthesis can be adjusted by one of ordinary skill in the art based on need and the scale of synthesis desired. In some embodiments, disuccinimidyl sulfoxide (DSSO) is synthesized as previously published¹⁶. The 2-step synthesis scheme for DHSO from DSSO is depicted in FIG. 1D. In some embodiments, the parameters and conditions of DHSO synthesis can be adjusted by one of ordinary skill in the art as desired. In some embodiments, tert-butyl carbazate (1.10 g, 8.32 mmol) is added to DSSO (1.41 g, 4.16 mmol) in dichloromethane (DCM) (50 mL). In some embodiments, the amount of tert-butyl carbazate ranges from about 0.22 g to about 5.5 g. In some embodiments, the amount of DSSO ranges from about 0.282 g to about 7.05 g. The resulting yellow solution is stirred at room temperature for 12 h. In some embodiments, the yellow solution is stirred at room temperature for about 2.4 h to about 60 h. Thereafter, trifluoroacetic acid (2.20 mL, 28.7 mmol) is added. In some embodiments, trifluoroacetic acid volume ranges from about 0.44 mL to about 11 mL. The resulting orange solution is stirred for 72 h before removing the solvent in vacuo. In some embodiments, the orange solution is stirred at room temperature for about 14.4 h to about 360 h. The resulting orange oil is dissolved in methanol, and then triethylamine is added. The resulting mixture is stirred for 20 min, resulting in the precipitation of a white solid. In some embodiments, the mixture is stirred for about 4 min to about 100 min. The solid is collected via centrifuge, and then stirred with fresh methanol for 20 min. In some embodiments, the solid is collected by other methods known in the art such as decantation, filtration, etc. In some embodiments, the solid is stirred and washed with fresh methanol for about 4 min to about 100 min. The solid is collected via centrifuge again, and this process of stirring with fresh methanol is repeated another 2 times. In some embodiments, the process is repeated about 1 to about 20 times. The isolated white solid is dried in vacuo to obtain DHSO. In some embodiments, the white solid is dried by other methods known in the art such as drying at room temperature, heating in an oven, etc. In some embodiments, the amount of DHSO obtained is about 0.375 g. In some embodiments, the amount of DHSO obtained ranges from about 0.075 g to about 1.875 g. It will be understood by one of ordinary skill in the art that room temperature refers to a temperature of about 70° F. (or about 21° C.). In some embodiments, room temperature can be about 65° F. to about 80° F. (or about 18° C. to about 27° C.).
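The quantity ranges recited above each span 0.2 to 5 times the corresponding exemplary amount (for example, 0.22 g to 5.5 g around 1.10 g of tert-butyl carbazate). The following minimal sketch scales the exemplary amounts by a single factor within that span. The names `Quantity`, `BASE_RECIPE` and `scale_recipe` are hypothetical. Treating the ranges as one common scale factor, and scaling the amount of DHSO obtained linearly, are simplifying assumptions rather than statements of the disclosure. The DCM volume is omitted because no range is given for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    name: str
    amount: float
    unit: str


# Exemplary amounts as stated in the text above.
BASE_RECIPE = (
    Quantity("tert-butyl carbazate", 1.10, "g"),
    Quantity("DSSO", 1.41, "g"),
    Quantity("trifluoroacetic acid", 2.20, "mL"),
    Quantity("DHSO obtained", 0.375, "g"),
)

# Assumption: the recited ranges correspond to one common scale factor.
MIN_FACTOR, MAX_FACTOR = 0.2, 5.0


def scale_recipe(factor: float) -> list:
    """Scale every exemplary amount by `factor`, kept within the recited span."""
    if not MIN_FACTOR <= factor <= MAX_FACTOR:
        raise ValueError(f"factor must be between {MIN_FACTOR} and {MAX_FACTOR}")
    return [Quantity(q.name, round(q.amount * factor, 4), q.unit) for q in BASE_RECIPE]


for q in scale_recipe(0.2):
    print(f"{q.name}: {q.amount} {q.unit}")
```

Calling `scale_recipe(5.0)` gives the upper ends of the same ranges.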

As shown below and in FIG. 1D, DHSO is composed of two hydrazide reactive groups separated by a 12.5 Å long spacer arm containing two symmetrical C—S cleavable bonds flanking a central sulfoxide.

In comparison to existing cross-linkers for XL-MS studies^(16-18,22), DHSO carries a linker length well suited for defining interaction interfaces between and within protein complexes.

In some embodiments, DHSO was synthesized and characterized with synthetic peptides and simple model proteins to confirm the features of its design. In some embodiments, the MS cleavability was validated by MS^(n) analysis. In some embodiments, the DHSO based cross-linking mass spectrometry workflow provided herein has proven to be as effective as those used for inventors' previously developed sulfoxide-containing amine reactive MS-cleavable cross-linkers, i.e. DSSO, DMDSSO and Azide-A-DSBSO. In some embodiments, the DHSO-based cross-linking mass spectrometry workflow can complement lysine reactive cross-linkers for in vitro and in vivo cross-linking studies.

In some embodiments, DHSO represents the first MS-cleavable cross-linker targeting acidic residues. DHSO comprises robust MS-cleavable bonds, i.e. C—S bonds adjacent to the center sulfoxide group, which enables fast, effective, and accurate identification of DHSO cross-linked peptides. In some embodiments, based on analysis of the standard protein BSA, many more cross-linked peptides were identified with DHSO in comparison to existing non-cleavable dihydrazide reagents. In some embodiments, DHSO cross-linked peptides possess the same characteristics distinctive to peptides cross-linked by other sulfoxide-containing amine reactive cross-linkers previously developed by the inventors.

In some embodiments, the unique features of DHSO significantly facilitate cross-linking studies targeting acidic residues, which has been difficult in the past due to the large number of D/E present in protein sequences and complexity of their resulting cross-linked peptides for MS analysis. Comparison of DHSO and DSSO confirms the need of expanding the coverage of protein interactions using cross-linkers targeting different residues, especially when the distribution of specific amino acids is uneven. Thus, in some embodiments, development of DHSO crosslinking will aid in the goal of defining protein-protein interactions at the global scale and understanding the structural dynamics of protein complexes and their mechanistic functions in cells. In some embodiments, the DHSO-based workflow provided herein can be applied to cross-linking studies of simple and complex protein mixtures in vitro and in vivo.

CID Fragmentation Patterns of DHSO Cross-Linked Peptides

A previous study showed that the reaction of hydrazide cross-linkers with acidic residues first requires activation of the terminal carboxyl groups of Asp (D) and Glu (E) side chains or protein C-termini²². The coupling reagent DMTMM was demonstrated to be effective in activating carboxylic acid groups to form a reactive intermediate that can be displaced by nucleophilic attack from hydrazides under physiological pH²² (FIG. 2A). Therefore, in the present disclosure, DMTMM was adopted as the activating agent for DHSO cross-linking of acidic residues. Similar to lysine-reactive cross-linkers, DHSO cross-linking would result in the formation of three types of cross-linked peptides: dead-end (type 0), intra-link (type 1), and inter-link (type 2) modified peptides, among which inter-linked peptides provide the most informative data on the relative spatial orientation of cross-linked acidic residues²⁶.

Since all of the MS-cleavable, homobifunctional NHS esters we previously developed display the same characteristic fragmentation patterns in MS² due to the cleavage of either of the two symmetric CID-cleavable C-S bonds adjacent to the sulfoxide functional group¹⁶⁻¹⁸, it is expected that DHSO cross-linked peptides will behave similarly during MS^(n) analysis even though their residue-targeting functional groups are different.

To elaborate on this process, FIG. 2B-FIG. 2D illustrate the predicted MS² fragmentation patterns of DHSO cross-linked peptides. For a DHSO inter-linked peptide α-β, the cleavage of one of the two symmetric C-S bonds would result in one of the two predicted peptide fragment pairs (i.e., α_(A)/β_(S) or α_(S)/β_(A)) (FIG. 2B). The resulting α and β peptide fragments are modified by complementary cross-linker remnant moieties, i.e., alkene (A) or sulfenic acid (S). Therefore, a total of four individual MS² fragment ion peaks are expected for a DHSO cross-linked heterodimer, representing the two predictable pairs of peptide fragments. These single peptide chains are then subjected to MS³ analysis, permitting unambiguous identification of each cross-linked peptide sequence and cross-linking site, respectively. For a DHSO intra-linked peptide αintra in which proximal D or E amino acid residues are cross-linked within the same peptide, one peptide fragment (i.e., α_(A+S)) is expected in MS² analysis (FIG. 2C). In reality, this particular ion would represent two populations of ion species that have identical peptide sequences and m/z values, but transposed DHSO remnant-modified acidic residues. Lastly, a DHSO dead-end modified peptide αDN would potentially fragment into two ion species during MS² analysis. Depending on the position of the cleaved C-S bond, α_(A) or α_(S) fragments would be observed, resulting in a pair of daughter ions detected during MS² (FIG. 2D). In all three scenarios, it is noted that the sulfenic acid moiety often undergoes dehydration to become a more stable and dominant unsaturated thiol moiety (i.e., T) (FIG. 2E), a conversion that is commonly observed in amine-reactive, sulfoxide-containing MS-cleavable cross-linked peptides and that does not complicate data analysis as previously demonstrated¹⁶⁻¹⁸.

The distinct MS² fragmentation patterns of sulfoxide-containing MS-cleavable cross-linked peptides result in predictable mass relationships between parent ions and their respective fragments. These mass relationships are utilized as an additional verification of cross-linked peptide identification at the MS² level. Along with mass fingerprinting by MS¹ and peptide sequencing by MS³, three lines of evidence can be obtained and integrated to accurately identify DHSO cross-linked peptides using the identical MS^(n) workflow previously developed for the analysis of DSSO, DMDSSO and DSBSO cross-linked peptides¹⁶⁻¹⁸.
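
The predicted fragment species for the three cross-link types can be summarized in a short sketch. This is only an illustration of the labelling scheme described above; the function name and the string labels are hypothetical and do not come from any existing software.

```python
from typing import List


def expected_ms2_fragments(link_type: str, dehydrated: bool = True) -> List[str]:
    """Return labels of the MS2 fragment species predicted for a DHSO-modified peptide.

    link_type:  "inter" (type 2), "intra" (type 1) or "dead-end" (type 0).
    dehydrated: if True, the sulfenic acid remnant (S) is reported as the
                unsaturated thiol (T) that forms by water loss.
    """
    s = "T" if dehydrated else "S"
    if link_type == "inter":
        # Cleavage of either symmetric C-S bond gives one of two complementary
        # pairs, so four fragment peaks are expected in total.
        return ["alpha_A", f"beta_{s}", f"alpha_{s}", "beta_A"]
    if link_type == "intra":
        # Both remnants stay on the same peptide: a single fragment peak.
        return [f"alpha_A+{s}"]
    if link_type == "dead-end":
        # One peptide, carrying either the alkene or the sulfur-containing remnant.
        return ["alpha_A", f"alpha_{s}"]
    raise ValueError(f"unknown link type: {link_type!r}")


if __name__ == "__main__":
    for kind in ("inter", "intra", "dead-end"):
        print(kind, expected_ms2_fragments(kind))
```

The `dehydrated` flag only switches the label between S and T; it does not model how often the conversion occurs.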

Characterization of DHSO Cross-Linked Model Peptides by MS^(n) Analysis

Despite the similarities in spacer arm structure to DSSO, it is necessary to verify whether DHSO cross-linked peptides indeed fragment as described above during MS^(n) analysis (FIG. 2). Initial characterization of DHSO was performed on a synthetic peptide containing a single acidic residue, Ac-SR8 (Ac-SAKAYEHR (SEQ ID NO: 178)). The inter-linked Ac-SR8 dimer was detected as quadruply-charged (m/z 548.7623⁴⁺) and quintuply-charged (m/z 439.2117⁵⁺) ion species. MS² analysis of the quadruply-charged parent ion produced a pair of dominant fragment ions α_(A)/α_(T) (m/z 536.27²⁺/552.26²⁺), demonstrating effective physical separation of the two cross-linked peptides as expected (FIG. 3A). Similarly, MS² analysis of the quintuply-charged parent ion (m/z 439.2117⁵⁺) yielded a single pair of dominant fragment ions α_(A)/α_(T) (m/z 357.85³⁺/552.26²⁺) as well (FIG. 3B), demonstrating that the characteristic fragmentation is independent of peptide charge, as expected. Subsequent MS³ analysis of the α_(A) (m/z 536.27²⁺) fragment ion (FIG. 3C) resulted in a series of y and b ions that unambiguously confirmed the peptide sequence as Ac-SAKAYE_(A)HR (SEQ ID NO: 179), in which the glutamic acid was modified with a DHSO alkene (A) moiety. Similarly, the MS³ spectrum of the α_(T) fragment (m/z 552.26²⁺) determined its identity as Ac-SAKAYE_(T)HR (SEQ ID NO: 180), in which the glutamic acid was modified with a DHSO unsaturated thiol (T) moiety (FIG. 3D). Therefore, the cross-linked peptide was identified as [Ac-SAKAYE⁶HR] (SEQ ID NO: 181) inter-linked to [Ac-SAKAYE⁶HR] (SEQ ID NO: 181) through E6 in both peptides. This result indicates that DHSO inter-linked peptides indeed display the same characteristic MS^(n) fragmentation as sulfoxide-containing lysine inter-linked peptides, and can be analyzed using the same data analysis workflow as previously described¹⁶⁻¹⁸.
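
The mass relationship between a parent ion and its fragment pair can be checked numerically with the Ac-SR8 values reported above. The sketch below assumes that the neutral mass of the inter-linked parent equals the sum of the alkene-modified and thiol-modified fragments plus one water, since the thiol (T) remnant is the dehydrated form of the sulfenic acid (S) remnant. The function names and the 0.05 Da tolerance are illustrative assumptions, with the tolerance chosen only to absorb the two-decimal rounding of the quoted fragment m/z values.

```python
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da


def neutral_mass(mz: float, z: int) -> float:
    """Convert an observed m/z at charge z to a neutral monoisotopic mass."""
    return (mz - PROTON) * z


def pair_matches_parent(parent, frag_a, frag_t, tol_da: float = 0.05) -> bool:
    """Check parent = fragment(A) + fragment(T) + H2O within tol_da.

    Each argument is an (m/z, charge) tuple. tol_da is a hypothetical
    tolerance, not a value taken from the source.
    """
    expected = neutral_mass(*frag_a) + neutral_mass(*frag_t) + WATER
    return abs(neutral_mass(*parent) - expected) <= tol_da


if __name__ == "__main__":
    # Ac-SR8 dimer, quadruply-charged parent and its 2+/2+ fragment pair
    print(pair_matches_parent((548.7623, 4), (536.27, 2), (552.26, 2)))
    # Same dimer, quintuply-charged parent and its 3+/2+ fragment pair
    print(pair_matches_parent((439.2117, 5), (357.85, 3), (552.26, 2)))
```

Both charge states of the dimer describe the same neutral species, so the same fragment pair should satisfy the relationship in each case.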

Characterization of DHSO Cross-Linked Model Proteins by MS^(n) Analysis

To evaluate the capability of DHSO for protein cross-linking in vitro, equine myoglobin and bovine serum albumin (BSA) were used as model proteins. These two proteins contain above-average acidic residue content (16.3% and 13.6%, respectively), making them well suited for evaluating DHSO cross-linking. In addition, BSA was employed previously for acidic residue cross-linking by non-cleavable dihydrazides²². To identify DHSO cross-linked peptides in myoglobin and BSA, in-gel digestion of gel-separated DHSO cross-linked proteins or in-solution digestion of DHSO cross-linked proteins followed by peptide SEC was performed as illustrated (FIG. 7A-FIG. 7D). Of note, S* (sulfenic acid moiety) can be converted to the more stable unsaturated thiol moiety (T) via water loss as shown in FIG. 7D. The resulting peptides were subjected to LC-MS^(n) analysis. FIG. 4A displays the MS¹ spectrum for an exemplary inter-link (α-β) (m/z 517.2703⁴⁺) identified from equine myoglobin. Its MS² analysis resulted in the detection of the two characteristic peptide fragment pairs α_(A)/β_(T) (m/z 429.74²⁺/569.63²⁺) and α_(T)/β_(A) (m/z 445.72²⁺/559.64²⁺) (FIG. 4B). MS³ analysis of α_(A) (m/z 429.74²⁺) (FIG. 4C) determined its sequence as ASE_(A)DLKK (SEQ ID NO: 182), in which the glutamic acid residue at the 3rd position from the N-terminus was modified with an alkene moiety. MS³ analysis of β_(T) (m/z 569.63²⁺) identified its sequence as VEAD_(T)IAGHGQEVLIR (SEQ ID NO: 183), with the aspartic acid residue at the 4th position from the N-terminus carrying an unsaturated thiol moiety (FIG. 4D). Together with MS¹ and MS² data, the inter-linked peptides were unambiguously identified as [¹⁸VEAD_(T)IAGHGQEVLIR³² (SEQ ID NO: 183) cross-linked to ⁵⁸ASE_(A)DLKK⁶⁴] (SEQ ID NO: 182), describing an inter-link formed between D21 and E60 of equine myoglobin.

FIG. 9A-FIG. 9D display MS^(n) analysis of a selected DHSO inter-linked BSA peptide, which was measured as a quadruply-charged ion (m/z 692.8475⁴⁺) in MS¹ (FIG. 9A). Its MS² spectrum revealed two pairs of complementary MS² fragment ions α_(A)/β_(T) (m/z 616.32²⁺/760.36²⁺) and α_(T)/β_(A) (m/z 632.32²⁺/744.37²⁺), further demonstrating the robust fragmentation expected of DHSO cross-linked peptides (FIG. 9B). Together with MS³ sequencing of MS² fragments α_(A) (m/z 616.32²⁺) and β_(T) (m/z 760.36²⁺) (FIG. 9C and FIG. 9D), this DHSO inter-linked peptide was unambiguously identified as [⁶⁶LVNE_(A)LTEFAK⁷⁵ (SEQ ID NO: 184) inter-linked to ⁸⁹SLHTLFGDE_(T)LCK¹⁰⁰] (SEQ ID NO: 185), in which residue E69 was cross-linked to residue E97 in BSA.

In addition to inter-linked peptides, intra-linked peptides were also observed as a result of DHSO cross-linking in the model proteins. For example, MS² fragmentation of an intra-linked myoglobin peptide (FIG. 10A-FIG. 10C) produced a single fragment ion peak α_(A+T) (m/z 514.02⁴⁺) that was 18 Da lighter than the parent ion, consistent with the expected fragmentation pattern described in FIG. 2C following dehydration of the sulfenic acid moiety to an unsaturated thiol moiety. Analysis of the α_(A+T) ion in subsequent MS³ analysis (FIG. 10C) yielded a series of b and y ions permitting the unambiguous identification of two peptides sharing identical sequences but transposed alkene and unsaturated thiol moieties: ¹⁰⁴YLE_(A)FISD_(T)AIIHVLHSK¹¹⁹ (SEQ ID NO: 186) and ¹⁰⁴YLE_(T)FISD_(A)AIIHVLHSK¹¹⁹ (SEQ ID NO: 187), indicating an intra-link between residues E106 and D110.

Moreover, MS² fragmentation of a myoglobin dead-end modified peptide (m/z 604.3095³⁺) resulted in the detection of a single pair of fragment ions α_(A)/α_(T) (m/z 559.30³⁺/569.96³⁺), consistent with the expected fragmentation pattern described in FIG. 2D (FIG. 11C). These fragment ions were then identified in subsequent MS³ analysis as ¹⁸VE_(A)ADIAGHGQEVLIR³² (SEQ ID NO: 188) and ¹⁸VE_(T)ADIAGHGQEVLIR³² (SEQ ID NO: 189), representing a dead-end cross-link located on E19 of myoglobin (FIG. 4C and FIG. 4D). These results further demonstrate the robust MS^(n) analysis of DHSO cross-linked peptides.

LC-MS^(n) analysis of DHSO cross-linked myoglobin identified a total of 33 unique inter-linked peptides, representing 32 unique E|D-E|D linkages (Table 1).

TABLE 1 Summary of DHSO Inter-Linked Myoglobin Peptides Identified by LC MS^(n) AA MS Δ Mod. MS2 Distance # Peptide Seq location m/z z (PPM) Position m/z z (Cα-Cα)  1 ALELFR A135-R140 421.2357 4  0.186 E_(T)137 424.726 2 — (SEQ ID NO: 1) ALELFR A135-R140 E_(A)137 408.740 2 (SEQ ID NO: 2)  2 HPGDFGADAQGAM(ox)TK H120-K134 613.7913 4 -2.547 D_(A)123|D_(A)127 793.885 2 21.6Å|15.4Å (SEQ ID NO: 3)| (SEQ ID NO: 4) ALELFR A135-R140 E_(A)137 408.740 2 (SEQ ID NO: 2)  3 HPGDFGADAQGAMTK H120-K134 609.7944 4   .435 D_(A)127 785.859 2 15.4Å (SEQ ID NO: 5) ALELFR A135-R140 E_(A)137 408.741 2 (SEQ ID NO: 2)  4 KKGHHEAELKPLAQSHATK K79-K97 737.1354 4  1.188 E_(A)84|E_(A)86 726.736 3 14.1Å|16.8Å (SEQ ID NO: 9)| (SEQ ID NO: 10) ELGFQG E149-G154 E_(A)149 718.335 1 (SEQ ID NO: 14)  5 LFTGHPETLEK L33-K43 441.8353 5 -0.494 E_(T)42 457.895 3 24.4Å (SEQ ID NO: 15) ALELFR A135-R140 E_(A)137 408.740 2 (SEQ ID NO: 2)  6 LFTGHPETLEK L33-K43 441.8364 5  1.996 E_(A)39|E_(A)42 447.239 3 22.0Å|24.4Å (SEQ ID NO: 16)| (SEQ ID NO: 17) ALELFR A135-R140 E_(A)137 408.740 2 (SEQ ID NO: 2)  7 LFTGHPETLEK L33-K43 437.0133 5  2.713 E_(T)39|E_(T)42 437.895 3 14.6 Å|14.7Å (SEQ ID NO: 18)| (SEQ ID NO: 15) TEAEM(ox)K T52-K57 E_(A)53 396.682 2 (SEQ ID NO: 19)  8 LFTGHPETLEK L33-K43 542.0137 4 -1.652 E_(T)42 686.342 2 14.7Å|15.4Å (SEQ ID NO: 15) TEAEMK T52-K57 E_(A)53|E_(A)55 388.685 2 (SEQ ID NO: 21)| (SEQ ID NO: 22)  9 LFTGHPETLEK L33-K43 546.2628 4 -3.499 E_(T)39 686.341 2 14.6Å (SEQ ID NO: 18) TEAEM(ox)K T52-K57 E_(A)53 396.682 2 (SEQ ID NO: 19) 10 LFTGHPETLEK L33-K43 546.0119 4 -2.068 E_(T)42 686.341 2 14.7Å (SEQ ID NO: 15) TEAEM(ox)K T52-K57 E_(A)53 396.682 2 (SEQ ID NO: 19) 11 LFTGHPETLEK L33-K43 433.8133 5  0.389 E_(T)42 457.896 3 15.4Å (SEQ ID NO: 15) TEAEMK T52-K57 E_(A)55 388.685 2 (SEQ ID NO: 22) 12 LFTGHPETLEK L33-K43 546.4794 5 -3.081 E_(T)39|E_(T)42 686.340 2 — (SEQ ID NO: 18)| (SEQ ID NO: 15) LFTGHPETLEK L33-K43 E_(A)39|E_(A)42 447.238 3 (SEQ ID NO: 16)| (SEQ ID NO: 17) 13 LFTGHPETLEK L33-K43 
683.1003 4 -0.109 E_(A)39|E_(A)42 670.355 2 — (SEQ ID NO: 16)| (SEQ ID NO: 17) LFTGHPETLEK L33-K43 E_(T)42 686.339 2 (SEQ ID NO: 15) 14 LFTGHPETLEK L33-K43 682.8481 4 -2.906 E_(A)42 670.354 2 — (SEQ ID NO: 17) LFTGHPETLEK L33-K43 E_(T)42 686.340 2 (SEQ ID NO: 15) 15 LFTGHPETLEK L33-K43 683.3470 4 -6.165 E_(A)39 670.355 2 — (SEQ ID NO: 16) LFTGHPETLEK L33-K43 E_(T)42 686.339 2 (SEQ ID NO: 15) 16 TEAEM(ox)K T52-K57 415.2066 4  0.561 E_(A)53 396.681 2 28.0Å (SEQ ID NO: 9) ALELFR A135-R140 E_(T)137 424.725 2 (SEQ ID NO: 1) 17 TEAEMK T52-K57 401.1799 4  0.611 E_(A)53 388.684 2 — (SEQ ID NO: 21) TEAEMK T52-K57 E_(T)53 404.67  2 (SEQ ID NO: 23) 18 TEAEMK T52-K57 421.7067 4  1.031 E_(A)53|E_(A)55 388.684 2 13.0Å|9.8Å (SEQ ID NO: 21)| 14.0Å|11.0Å (SEQ ID NO: 22) ASEDLK A58-K63 E_(T)60|D_(T)61 445.722 2 (SEQ ID NO: 24)| (SEQ ID NO: 25) 19 TEAEMK T52-K57 421.7063 4 -0.392 E_(A)53 388.684 2 13.0Å|9.8Å (SEQ ID NO: 21) ASEDLKK A58-K64 E_(T)60|D_(T)61 445.724 2 (SEQ ID NO: 24)| (SEQ ID NO: 25) 20 TEAEMK T52-K57 599.7642 4 -3.253 E_(A)53 388.684 2 32.1Å (SEQ ID NO: 21) HPGDFGADAQGAMTK H120-K134 D_(T)127 801.844 2 (SEQ ID NO: 6) 21 VEADIAGHGQEVLIR V18-R32 500.8524 5 18.47  E_(T)28 569.627 3 10.5Å (SEQ ID NO: 27) TEAEMK T52-K57 E_(A)53 388.685 2 (SEQ ID NO: 21) 22 VEADIAGHGQEVLIR V18-R32 629.8124 4  0.950 E_(T)28 853.939 2 10.5Å|13.0Å (SEQ ID NO: 27) TEAEM(ox)K T52-K57 E_(A)53|E_(A)55 396.681 2 (SEQ ID NO: 19)| (SEQ ID NO: 20) 23 VEADIAGHGQEVLIR V18-R32 614.5657 4 -2.085 E_(A)19|D_(A)21 558.970 3 18.4Å|14.7Å (SEQ ID NO: 28)| (SEQ ID NO: 29) ASEDLK A58-K63 E_(A)60 730.374 1 (SEQ ID NO: 26) 24 VEADIAGHGQEVLIR V18-R32 614.3143 4 -2.107 E_(A)19|D_(A)21 558.969 3 18.4Å|18.8Å (SEQ ID NO: 28)| 14.7Å|16.1Å (SEQ ID NO: 29) ASEDLK A58-K62 E_(T)60|D_(T)61 730.374 1 (SEQ ID NO: 24)| (SEQ ID NO: 25) 25 VEADIAGHGQEVLIR V18-R32 517.2736 5  1.308 D_(T)21 569.627 3 14.7Å|16.1Å (SEQ ID NO: 30) ASEDLKK A58-K63 E_(T)60|D_(T)61 445.725 2 (SEQ ID NO: 24)| (SEQ ID NO: 25) 26 VEADIAGHGQEVLIR V18-R32 
517.2736 5  1.308 D_(T)21 569.627 3 14.7Å (SEQ ID NO: 30) ASEDLKK A58-K64 E_(A)60 429.738 2 (SEQ ID NO: 26) 27 VEADIAGHGQEVLIR V18-R32 659.7219 5  1.861 D_(T)21 569.627 3 15.7Å (SEQ ID NO: 30) HPGDFGADAQGAMTK H120-K134 D_(A)123 785.858 2 (SEQ ID NO: 7) 28 VEADIAGHGQEVLIR V18-R32 659.7219 5  1.861 D_(T)21 569.627 3 15.7Å|22.3Å (SEQ ID NO: 30) HPGDFGADAQGAMTK H120-K134 D_(T)123|D_(T)127 801.847 2 (SEQ ID NO: 8)| (SEQ ID NO: 6) 29 VEADIAGHGQEVLIR V19-R32 659.7193 5 -2.080 E_(A)19 558.969 3 19.8Å (SEQ ID NO: 28) HPGDFGADAQGAMTK H120-K134 D_(T)127 801.843 2 (SEQ ID NO: 6) 30 VEADIAGHGQEVLIR V18-R32 504.0492 5 -2.495 D_(T)21|E_(T)28 569.627 3 20.1Å|10.5Å (SEQ ID NO: 30)| (SEQ ID NO: 27) TEAEM(ox)K T52-K57 E_(A)53 396.681 2 (SEQ ID NO: 19) 31 YLEFISDAIIHVLHSK Y104-K119 564.5054 5 -3.321 D_(T)110 662.346 3 10.1Å (SEQ ID NO: 31) ALELFR A135-R140 E_(A)137 408.739 2 (SEQ ID NO: 2) 32 YLEFISDAIIHVLHSK Y104-K119 705.3836 4  1.880 E_(A)106 651.691 3 8.4Å (SEQ ID NO: 32) ALELFR A135-R140 E_(A)137 816.475 1 (SEQ ID NO: 2) 33 YLEFISDAIIHVLHSK Y104-K119 564.5085 5  2.171 E_(T)106|D_(T)110 662.350 3 8.4Å|10.1Å (SEQ ID NO: 33)| (SEQ ID NO: 31) ALELFR A135-R140 E_(A)137 408.741 2 (SEQ ID NO: 2) Note: “I” means or; “&” means and; “T”: unsaturated thiol moiety; “A”: alkene moiety; “ox” means oxidation.

Similarly, 62 unique DHSO inter-linked BSA peptides were identified, describing 69 unique E|D-E|D linkages (Table 2).

TABLE 2 Summary of DHSO Inter-Linked BSA Peptides Identified by LC MS^(n) AA MS Δ Mod. MS2 Distance # Peptide Seq location m/z z (PPM) Position m/z z (Cα-Cα)  1 AEFVEVTK A249-K256 475.0076 4 -0.713 E_(A)250 495.765 2 16.1Å (SEQ ID NO: 34) LVTDLTK L257-K263 D_(A)260 429.257 2 (SEQ ID NO: 37)  2 AEFVEVTK A249-K256 475.0078 4 -0.292 E_(T)253 511.752 2 10.9Å (SEQ ID NO: 35) LVTDLTK L257-K263 D_(A)260 429.257 2 (SEQ ID NO: 37)  3 AEFVEVTK A249-K256 544.7503 4 -0.181 E_(A)250 495.768 2 45.1Å (SEQ ID NO: 34) QNCDQFEK Q413-K420 E_(T)419 584.731 2 (SEQ ID NO: 39)  4 AEFVEVTK A249-K256 638.5493 4 -1.936 E_(A)250|E_(A)253 495.767 2 11.5Å|11.6Å (SEQ ID NO: 34)| (SEQ ID NO: 36) YICDNQDTISSK Y286-K297 D_(T)292 772.330 2 (SEQ ID NO: 43)  5 AEFVEVTK A249-K256 638.5493 4 -1.936 E_(A)250 495.767 2 11.5Å (SEQ ID NO: 34) YICDNQDTISSK Y286-K297 D_(T)292 772.330 2 (SEQ ID NO: 43)  6 AEFVEVTK A249-K256 638.5522 4  2.606 E_(A)253 495.767 2 12.8Å|11.6Å (SEQ ID NO: 36) YICDNQDTISSK Y286-K297 D_(T)289|D_(T)292 772.331 2 (SEQ ID NO: 44)| (SEQ ID NO: 43)  7 ATEEQLK A562-K568 601.5462 4 -0.098 E_(A)564|E_(A)565 443.735 2 12.4Å|11.0Å (SEQ ID NO: 46)| (SEQ ID NO: 47) TVM(ox)ENFVAFVDK T569-K580 E_(T)572 750.354 2 (SEQ ID NO: 48)  8 ATEEQLK A562-K568 601.5485 4  1.611 E_(A)565 443.735 2 11.0Å (SEQ ID NO: 47) TVMENFVAFVDK T569-K580 E_(T)572 750.356 2 (SEQ ID NO: 49)  9 DTHKSEIAHR D25-R34 499.8580 8 -1.939 D_(T)25 431.877 3 21.2Å|20.9Å| (SEQ ID NO: 50) 21.7Å VHKECCHGDLLECADD V264-K285 D_(A)278|D_(A)279| 536.846 5 RADLAK D_(A)282 (SEQ ID NO: 54)| (SEQ ID NO: 55)| (SEQ ID NO: 56) 10 DTHKSEIAHR D25-R34 570.9825 7  4.106 D_(T)25 431.877 3 21.2Å|21.7Å (SEQ ID NO: 50) VHKECCHGDLLECADD V264-K285 D_(A)279|D_(A)282 670.808 4 RADLAK (SEQ ID NO: 55)| (SEQ ID NO: 56) 11 DTHKSEIAHR D25-R34 499.7346 8  2.110 D_(T)25|E_(T)30 431.876 3 21.2Å|20.9Å| (SEQ ID NO: 50)| 21.7Å, 12.0Å| (SEQ ID NO: 51) 13.9Å|16.7Å VHKECCHGDLLECADD V264-K285 D_(A)278|D_(A)279| 536.847 5 RADLAK D_(A)282 (SEQ ID NO: 54)| (SEQ ID NO: 55)| (SEQ 
ID NO: 56) 12 DTHKSEIAHR D25-R34 570.9843 8 -0.169 D_(T)25|E_(T)30 6431.876  3 21.2Å|21.7Å, (SEQ ID NO: 50)| 12.0Å|16.7Å (SEQ ID NO: 51) VHKECCHGDLLECADD V264-K285 D_(A)278|D_(A)282 536.847 5 RADLAK (SEQ ID NO: 54)| (SEQ ID NO: 56) 13 DTHKSEIAHR D25-R34 570.9797 7 -0.798 E_(T)30 647.311 2 12.0Å|13.9Å| (SEQ ID NO: 51) 16.7Å VHKECCHGDLLECADD V264-K285 D_(A)278|D_(A)279| 536.848 5 RADLAK D_(A)282 (SEQ ID NO: 54)| (SEQ ID NO: 55)| (SEQ ID NO: 56) 14 DTHKSEIAHR D25-R34 542.7869 4  0.787 D_(A)25 631.324 2 22.7Å (SEQ ID NO: 52) LVTDLTK L257-K263 D_(A)260 429.259 2 (SEQ ID NO: 37) 15 DTHKSEIAHR D25-R34 711.5669 4  0.368 D_(T)25 647.310 2 18.8Å|15.4Å (SEQ ID NO: 50) TCVADESHAGCEK T76-K88 D_(A)80|E_(A)81 766.318 2 (SEQ ID NO: 59)| (SEQ ID NO: 60) 16 DTHKSEIAHR D25-R34 711.3146 4 -1.686 D_(T)25 647.311 2 14.6Å (SEQ ID NO: 50) TCVADESHAGCEK T76-K88 E_(A)87 766.316 2 (SEQ ID NO: 61) 17 DLGEEHFK D37-K44 438.2119 4  1.889 E_(T)41 537.738 2 16.3Å|13.0Å (SEQ ID NO: 62) ADEKK A152-K156 E_(A)154 329.680 2 (SEQ ID NO: 63) 18 GLVLIAFSQYLQQCPF G45-K65 902.9565 4 -1.045 D_(A)61|E_(A)62 1280.660  2 12.4Å|13.7Å DEHVK (SEQ ID NO: 64)| (SEQ ID NO: 190) YLYEIAR Y161-R167 E_(T)164 514.256 2 (SEQ ID NO: 65) 19 HLVDEPQNLIK H402-K412 665.0706 4   .496 D_(T)405 703.368 2 13.6Å (SEQ ID NO: 67) CCTKPESER C460-R468 E_(A)465 617.770 2 (SEQ ID NO: 70) 20 HLVDEPQNLIK H402-K412 665.0706 4   .496 D_(A)405|E_(A)406 687.382 2 13.6Å|15.8Å (SEQ ID NO: 68)| (SEQ ID NO: 69) CCTKPESER C460-R468 E_(A)465 617.770 2 (SEQ ID NO: 70) 21 HLVDEPQNLIK H402-K412 532.4614 5  5.743 D_(T)405 703.367 2 13.6Å|13.1Å (SEQ ID NO: 67) CCTKPESER C460-R468 E_(A)465|E_(A)467 412.182 3 (SEQ ID NO: 70)| (SEQ ID NO: 71) 22 IETMR I205-R210 579.3010 4  1.167 E_(A)206 359.19 2 17.8Å (SEQ ID NO: 75) LGEYGFQNALIVR L421-R433 E_(T)423 790.405 2 (SEQ ID NO: 76) 23 IETMR I205-R210 587.3107 4 -2.486 E_(A)206 359.19 2 18.8Å (SEQ ID NO: 75) VPQVSTPTLVEVSR V438-R451 E_(T)448 806.428 2 (SEQ ID NO: 78) 24 LKPDPNTLCDEFK L139-K151 673.0840 4  1.077 
E_(T)149 838.893 2 14.9Å (SEQ ID NO: 79) YLYEIAR Y161-R167 E_(A)164 498.269 2 (SEQ ID NO: 66) 25 LKPDPNTLCDEFKADEK L139-K155 627.5087 5 -0.588 D_(T)153|E_(T)154 707.332 3 16.7Å|15.6Å (SEQ ID NO: 81)| (SEQ ID NO: 82) YLYEIAR Y161-R167 E_(A)164 498.269 2 (SEQ ID NO: 66) 26 LVTDLTK L257-K263 897.4279 4  0.795 D_(A)260 857.511 1 9.1Å (SEQ ID NO: 37) VHKECCHGDLLECADD V264-K285 D_(A)282 894.072 3 RADLAK (SEQ ID NO: 57) 27 LVTDLTK L257-K263 513.2476 7  0.730 D_(T)260 445.245 2 7.8Å, 9.1Å (SEQ ID NO: 38) VHKECCHGDLLECADD V264-K285 D_(A)279|D_(A)282 536.848 5 RADLAK (SEQ ID NO: 55)| (SEQ ID NO: 56) 28 LVTDLTK L257-K263 513.2481 7  1.704 D_(T)260 445.245 2 11.6Å, 7.8Å, (SEQ ID NO: 38) 9.1Å VHKECCHGDLLECADD V264-K285 D_(A)278|D_(A)279| 536.848 5 RADLAK D_(A)282 (SEQ ID NO: 54)| (SEQ ID NO: 55)| (SEQ ID NO: 56) 29 LVTDLTK L257-K263 605.4583 4  1.684 D_(A)260 429.259 2 18.9Å (SEQ ID NO: 37) YICDNQDTISSK Y286-K297 D_(A)292 772.330 2 (SEQ ID NO: 45) 30 LKECCDKPLLEK L298-K309 558.8729 5 -1.841 E_(A)300|D_(A)303 534.278 3 11.2Å|15.8Å (SEQ ID NO: 83)| (SEQ ID NO: 84) SHCIAEVEK S310-K318 E_(T)317 586.764 2 (SEQ ID NO: 88) 31 LKECCDKPLLEK L298-K309 465.8965 6  0.744 E_(A)300 400.961 4 11.2Å (SEQ ID NO: 83) SHCIAEVEK S310-K318 E_(T)317 586.764 2 (SEQ ID NO: 88) 32 LKECCDKPLLEK L298-K309 698.3413 4  1.014 E_(T)300 816.901 2 11.9Å (SEQ ID NO: 85) SHCIAEVEK S310-K318 E_(A)315 570.778 2 (SEQ ID NO: 89) 33 LKECCDKPLLEK L298-K309 698.3408 4  0.298 E_(A)300 800.915 2 11.9Å|11.2Å (SEQ ID NO: 83) SHCIAEVEK S310-K318 E_(A)315|E_(A)317 570.778 2 (SEQ ID NO: 89)| (SEQ ID NO: 90) 34 LKECCDKPLLEK L298-K309 721.8368 4  0.393 E_(A)300 800.914 2 26.4Å (SEQ ID NO: 83) CCTKPESER C460-R468 E_(A)467 617.769 2 (SEQ ID NO: 71) 35 LKECCDKPLLEK L298-K309 721.8368 4  0.393 E_(T)300|D_(T)303 816.901 2 26.4Å|31.1Å (SEQ ID NO: 85)| (SEQ ID NO: 86) CCTKPESER C460-R468 E_(A)467 617.769 2 (SEQ ID NO: 71) 36 LKECCDKPLLEK L298-K309 494.8425 5 -2.149 E_(T)300|D_(T)303 544.935 3 40.9Å|41.7Å (SEQ ID NO: 85)| (SEQ ID NO: 
86) NYQEAK N341-K346 E_(A)344 410.702 2 (SEQ ID NO: 92) 37 LKHLVDEPQNLIK L400-K412 580.4977 5  7.275 D_(T)405|E_(T)406 823.959 2 13.6Å|13.1Å (SEQ ID NO: 93)| (SEQ ID NO: 94) CCTKPESER C460-R468 E_(A)465|E_(A)467 412.183 3 15.8Å|15.3Å (SEQ ID NO: 70)| (SEQ ID NO: 71) 38 LCVLHEK L483-K489 518.7810 4  1.484 E_(T)488 499.749 2 10.7Å (SEQ ID NO: 96) TPVSEKVTK T490-K498 E_(T)494 544.793 2 (SEQ ID NO: 97) 39 LVNELTEFAK L66-K75 692.8486 4  -.732 E_(A)69 616.337 2 9.3Å (SEQ ID NO: 100) SLHTLFGDELCK S89-K100 E_(A)97 744.370 2 (SEQ ID NO: 102) 40 LVNELTEFAK L66-K75 692.8496 4  0.712 E_(A)72 616.337 2 13.3Å (SEQ ID NO: 101) SLHTLFGDELCK S89-K100 E_(A)97 744.370 2 (SEQ ID NO: 102) 41 LVNELTEFAK L66-K75 692.8486 4  -.732 E_(A)69 616.338 2 9.3Å|12.1Å (SEQ ID NO: 100) SLHTLFGDELCK S89-K100 D_(T)96|E_(T)97 760.357 2 (SEQ ID NO: 103)| (SEQ ID NO: 104) 42 LVNELTEFAK L66-K75 692.8502 4  1.578 E_(A)72 616.338 2 15.2Å|13.3Å (SEQ ID NO: 101) SLHTLFGDELCK S89-K100 D_(T)96|E_(T)97 760.357 2 (SEQ ID NO: 103)| (SEQ ID NO: 104) 43 NECFLSHKDDSPDLPK N123-K138 746.1718 5 -7.481 D_(A)135 657.309 3 13.5Å (SEQ ID NO: 105) KVPQVSTPTLVEVSR K437-R451 E_(A)448 854.493 2 (SEQ ID NO: 106) 44 QNCDQFEK Q413-K420 518.7349 4   .292 D_(T)416 584.731 2 16.9Å (SEQ ID NO: 39) ATEEQLK A562-K568 E_(A)564 443.736 2 (SEQ ID NO: 46) 45 QNCDQFEK Q413-K420 605.7531 4  2.460 E_(T)419 568.745 2 16.8Å (SEQ ID NO: 39) CCTKPESER C460-R468 E_(T)467 633.756 2 (SEQ ID NO: 72) 46 QNCDQFEK Q413-K420 605.7520 4   .645 E_(A)419 584.730 2 14.0Å (SEQ ID NO: 40) CCTKPESER C460-R468 E_(T)465 633.755 2 (SEQ ID NO: 73) 47 QNCDQFEK Q413-K420 605.7518 4   .314 E_(T)419 584.730 2 14.0Å|16.8Å (SEQ ID NO: 41) CCTKPESER C460-R468 E_(T)465|E_(T)467 633.756 2 (SEQ ID NO: 73)| (SEQ ID NO: 72) 48 QNCDQFEK Q413-K420 605.7521 4  0.810 D_(A)416 568.744 2 12.6Å (SEQ ID NO: 42) CCTKPESER C460-R468 E_(T)465 633.756 2 (SEQ ID NO: 73) 49 QNCDQFEK Q413-K420 484.8036 5  1.768 D_(T)416 584.730 2 12.6Å|15.7Å (SEQ ID NO: 41) CCTKPESER C460-R468 
E_(A)465|E_(A)467 412.183 3 (SEQ ID NO: 70)| (SEQ ID NO: 71) 50 QNCDQFEK Q413-K420 684.0776 4  0.508 E_(A)419 568.744 2 8.2Å (SEQ ID NO: 40) LGEYGFQNALIVR L421-R433 E_(A)423 774.420 2 (SEQ ID NO: 77) 51 RPCFSALTPDETYVPK R508-K523 721.8555 4  1.171 E_(A)518 974.982 2 17.8Å (SEQ ID NO: 107) ATEEQLK A562-K568 E_(A)564 443.735 2 (SEQ ID NO: 46) 52 SEIAHR S29-R34 585.9390 6  1.516 E_(T)30 406.695 2 16.7Å (SEQ ID NO: 108) VHKECCHGDLLECADD V264-K285 D_(A)282 670.804 4 RADLAK (SEQ ID NO: 56) 53 SHCIAEVEK S310-K318 606.7698 4  2.091 E_(T)315|E_(T)317 586.763 2 18.1Å|15.8Å (SEQ ID NO: 91)| (SEQ ID NO: 88) CCTKPESER C460-R468 E_(T)467 633.755 2 (SEQ ID NO: 72) 54 SHCIAEVEK S310-K318 606.7702 4  2.751 E_(A)317 570.778 2 15.0Å (SEQ ID NO: 90) C460-R468 CCTKPESER E_(T)467 633.756 2 (SEQ ID NO: 72) 55 SHCIAEVEK S310-K318 485.6177 5  2.924 E_(T)317 586.764 2 20.3Å|15.0Å (SEQ ID NO: 88) CCTKPESER C460-R468 E_(A)465|E_(A)467 412.183 3 (SEQ ID NO: 70)| (SEQ ID NO: 71) 56 SHCIAEVEK S310-K318 503.4868 4  2.604 E_(T)315 586.764 2 35.7Å (SEQ ID NO: 91) NYQEAK  N341-K346 E_(A)344 410.701 2 (SEQ ID NO: 92) 57 VHKECCHGDLLECADD V264-K285 645.9636 6  4.495 E_(A)275|D_(A)278| 670.808 4 27.6Å|24.5Å| RADLAK D_(A)279|D_(A)282 22.8Å|21.2Å (SEQ ID NO: 57)| (SEQ ID NO: 54)| (SEQ ID NO: 55)| (SEQ ID NO: 56) SHCIAEVEK S310-K318 E_(T)317 586.765 2 (SEQ ID NO: 88) 58 VHKECCHGDLLECADD V264-K285 707.8174 6  1.732 D_(A)278|D_(A)279| 670.807 4 16.7Å|15.2Å| RADLAK D_(A)282 10.6Å (SEQ ID NO: 54)| (SEQ ID NO: 55)| (SEQ ID NO: 56) YICDNQDTISSK Y286-K297 D_(T)289 772.330 2 (SEQ ID NO: 44) 59 YICDNQDTISSK Y286-K297 730.8301 4  1.435 D_(T)289 772.331 2 10.9Å (SEQ ID NO: 44) ECCDKPLLEK E300-K309 D_(T)303 680.325 2 (SEQ ID NO: 109) 60 YICDNQDTISSK Y286-K297 791.1262 4  3.024 D_(T)292 772.331 2 15.0Å (SEQ ID NO: 43) LKECCDKPLLEK L298-K309 D_(T)303 816.902 2 (SEQ ID NO: 86) 61 YICDNQDTISSK Y286-K297 791.1237 4 -0.136 D_(A)292 756.334 2 13.4Å|15.0Å (SEQ ID NO: 45) LKECCDKPLLEK L298-K309 E_(T)300|D_(T)303 816.900 2 (SEQ 
ID NO: 85)| (SEQ ID NO: 86) 62 YICDNQDTISSK Y286-K297 676.0577 4  2.777 D_(A)292 756.345 2 15.5Å (SEQ ID NO: 45) SHCIAEVEK S310-K318 E_(T)317 586.764 2 (SEQ ID NO: 88) Note: “I” means or; “&” means and; “T”: unsaturated thiol moiety; “A”: alkene moiety; “ox” means oxidation.
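
The Δ (PPM) column in Tables 1 and 2 reports the precursor mass error. A minimal sketch of that calculation is given below, assuming the conventional definition (observed minus theoretical, divided by theoretical, times 10⁶); the source does not spell out the formula, and the function names, the example m/z values and the 10 ppm default tolerance are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (conventional definition, assumed here)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm (hypothetical default)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic
    print(round(ppm_error(500.001, 500.0), 3))
    print(within_tolerance(500.001, 500.0))
```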

In some embodiments, the number of unique peptides obtained depends on several factors, including but not limited to the size, complexity, conformation, subunits, domains, etc., of the protein and/or protein complex. In some embodiments, the number of unique peptides ranges from about 5 to about 500. In some embodiments, the number of unique peptides ranges from about 10 to about 1000. In some embodiments, the number of unique peptides ranges from about 50 to about 5000. In some embodiments, the number of unique peptides ranges from about 500 to about 50,000. Collectively, the results presented thus far indicate that DHSO can effectively cross-link acidic residue-containing peptides and proteins at neutral pH in the presence of DMTMM as the activating agent. More importantly, in some embodiments, the results demonstrate that DHSO cross-linked peptides indeed exhibit the expected characteristic MS² fragmentation patterns, allowing their facile and accurate identification.

DHSO Cross-Linking Maps of Myoglobin and BSA

In order to assess the efficacy and sequence coverage of DHSO cross-linking on model proteins, cross-linking maps of myoglobin and BSA based on the identified DHSO inter-linked peptides were generated. The secondary structure of equine myoglobin comprises eight α-helices and one short 3₁₀ helix (PDB: 1DWR) (FIG. 5A). The globular nature of myoglobin suggests that many of the helices are in close proximity to one another in three-dimensional space. The DHSO cross-link map of myoglobin based on the 32 unique E|D-E|D linkages is illustrated in FIG. 5B, describing numerous intra- and inter-secondary structure interactions (e.g., α1-α5, α1-α8, α2-α4, α3-α4, α3-α8, α4-α5, α4-α8, α6-α8, α7-α8, and α8-α8). To evaluate the legitimacy of the identified cross-links, the cross-linked residues were mapped onto the crystal structure of myoglobin and the distances between their alpha carbons (Cα-Cα distances) were calculated (FIG. 5D and FIG. 5F). Considering the length of DHSO (12.5 Å) and the distances contributed by D/E side chains (4.5 Å/6.0 Å, respectively), as well as backbone flexibility and structural dynamics, the theoretical upper limit for the Cα-Cα distances between DHSO cross-linked acidic residues is estimated to be ~35 Å. Therefore, the distance threshold for cross-linkable D/E residues was set at 35 Å. Of the 32 myoglobin DHSO cross-links, 27 were mapped onto the structure, all of which have a Cα-Cα distance <35 Å. The remaining 5 linkages were not mapped onto the structure because they were identified as possible sites of oligomerization, in which identical residues or peptide sequences were cross-linked together.
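
The distance check described above can be sketched as follows. The spacer length, the Asp side-chain contribution and the 35 Å threshold are taken from the text; the Glu side-chain contribution is taken here as 6.0 Å, which is an assumption about the intended value. The coordinates in the usage example are hypothetical and are not read from PDB 1DWR, and the function names are illustrative only.

```python
import math
from typing import Sequence

DHSO_SPACER = 12.5                 # Å, cross-linker length stated in the text
SIDE_CHAIN = {"D": 4.5, "E": 6.0}  # Å; the E value is an assumed reading
THRESHOLD = 35.0                   # Å, Cα-Cα cutoff used in the text


def max_static_span(res1: str, res2: str) -> float:
    """Spacer plus both side chains, before any allowance for flexibility."""
    return DHSO_SPACER + SIDE_CHAIN[res1] + SIDE_CHAIN[res2]


def ca_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two Cα coordinates given in Å."""
    return math.dist(a, b)


def is_within_threshold(a: Sequence[float], b: Sequence[float],
                        threshold: float = THRESHOLD) -> bool:
    """True if the Cα-Cα distance falls below the cross-linkable cutoff."""
    return ca_distance(a, b) < threshold


if __name__ == "__main__":
    # Hypothetical Cα coordinates, for illustration only
    ca_1 = (10.0, 4.0, -2.5)
    ca_2 = (21.0, 9.5, 3.0)
    print(round(ca_distance(ca_1, ca_2), 1), is_within_threshold(ca_1, ca_2))
    print(max_static_span("E", "E"))
```

The gap between the static span and the 35 Å cutoff is the allowance the text attributes to backbone flexibility and structural dynamics.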

Similarly, a DHSO cross-link map of BSA was generated based on the 69 unique E|D-E|D linkages (FIG. 6A). When mapped to a previously published BSA crystal structure (PDB: 4F5S), 65 out of 69 BSA linkages (94%) were calculated to have Cα-Cα distances below 35 Å (FIG. 12A and FIG. 12C). Structural flexibility and/or oligomerization of BSA likely accounts for the other four identified linkages, which were found to be >35 Å.

In some embodiments, the Cα-Cα distance ranges from about 5 Å to about 50 Å. In some embodiments, the Cα-Cα distance ranges from about 1 Å to about 100 Å. In some embodiments, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 0-5 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 5-10 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 10-15 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 15-20 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 20-25 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 25-30 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 30-35 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 35-40 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 40-45 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 45-50 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 50-55 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 55-60 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 60-65 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 65-70 Å, the number of cross-links range from about 1 to about 100. 
In some embodiments, for a Cα-Cα distance range of about 70-75 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 75-80 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 80-85 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 85-90 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 90-95 Å, the number of cross-links range from about 1 to about 100. In some embodiments, for a Cα-Cα distance range of about 95-100 Å, the number of cross-links range from about 1 to about 100. In some embodiments, the number of cross-links range from about 100 to about 10,000 for one or more of the Cα-Cα distance ranges disclosed herein. In some embodiments, about 5% to about 45% of intra-links are within any of the ranges disclosed herein. In some embodiments, about 25% to about 65% of intra-links are within any of the ranges disclosed herein. In some embodiments, about 45% to about 99% of intra-links are within any of the ranges disclosed herein. In some embodiments, about 5% to about 45% of inter-links are within any of the ranges disclosed herein. In some embodiments, about 25% to about 65% of inter-links are within any of the ranges disclosed herein. In some embodiments, about 45% to about 99% of inter-links are within any of the ranges disclosed herein. In some embodiments, about 5% to about 45% of intra- and inter-links are within any of the ranges disclosed herein. In some embodiments, about 25% to about 65% of intra- and inter-links are within any of the ranges disclosed herein. In some embodiments, about 45% to about 99% of intra- and inter-links are within any of the ranges disclosed herein.
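
The 5 Å distance ranges enumerated above lend themselves to a simple tally. The sketch below bins Cα-Cα distances into 5 Å ranges and reports the fraction below a cutoff. The sample list is a subset of the single-distance entries in Table 1 (rows 5, 9, 10, 11, 16, 20, 21, 27, 29, 31 and 32), used only for illustration; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def bin_distances(distances: Iterable[float],
                  width: float = 5.0) -> Dict[Tuple[float, float], int]:
    """Count distances per half-open range [lower, lower + width)."""
    counts: Counter = Counter()
    for d in distances:
        lower = int(d // width) * width
        counts[(lower, lower + width)] += 1
    return dict(sorted(counts.items()))


def fraction_within(distances: Iterable[float], threshold: float = 35.0) -> float:
    """Fraction of distances strictly below the threshold."""
    values = list(distances)
    return sum(d < threshold for d in values) / len(values)


# Subset of Cα-Cα distances (Å) from Table 1, for illustration only
table1_subset = [24.4, 14.6, 14.7, 15.4, 28.0, 32.1, 10.5, 15.7, 19.8, 10.1, 8.4]

if __name__ == "__main__":
    for (lower, upper), n in bin_distances(table1_subset).items():
        print(f"{lower:.0f}-{upper:.0f} Å: {n}")
    print(fraction_within(table1_subset))
```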

As shown in FIG. 6A, DHSO inter-links were distributed throughout the primary sequence of BSA, with regions of dense cross-link clusters identified in regions with higher α-helix density. This even distribution is likely due to the widespread dispersion of aspartic acid and glutamic acid residues throughout the protein. In some embodiments, the results herein suggest that DHSO cross-linking yields cross-links within expected distance constraints useful for structural elucidation for computational modeling in the same way as primary amine cross-linked data.

Comparison of MS-Cleavable and Non-Cleavable Acidic Residue Cross-Linking

Previously, two non-cleavable acidic residue cross-linkers, i.e., adipic acid dihydrazide (ADH) and pimelic acid dihydrazide (PDH), were used for probing the structure of bovine serum albumin, which identified 27 and 35 unique acidic residue linkages, respectively²². A comparison of the linkage maps generated for DHSO, ADH, and PDH cross-linking of BSA (FIG. 6A-FIG. 6C) revealed a high degree of similarity in proximally cross-linked regions. Apart from covering interaction regions cross-linked by ADH and PDH, DHSO cross-linking resulted in 34 additional unique D|E-D|E linkages. ADH (FIG. 6B) and PDH (FIG. 6C) cross-link maps were generated based on data obtained by Leitner et al.²². These unique DHSO cross-links are generally clustered in regions of particularly high acidic residue density, such as the regions between D25 and D97, E250 and E344, and D405 and E494 (FIG. 6A-FIG. 6C). Limitations in bioinformatics software for analyzing non-cleavable cross-linked peptides have been previously noted²². These limitations make accurate identification of acidic residue cross-linked peptides considerably more challenging, because acidic residues occur more frequently than lysine and therefore enlarge the search space. In contrast, CID-induced cleavage of DHSO cross-linked peptides during MS² significantly simplifies subsequent peptide sequencing with MS³. Given the same acidic residue reactive chemistry, the increase in identified cross-links using DHSO is mainly attributed to the simplified cross-link identification with improved accuracy afforded by MS-cleavability of DHSO cross-linked peptides. This ultimately facilitates unambiguous identification of individual linkages amidst peptides with multiple acidic residues in sequence. These results demonstrate the advantage of using DHSO, an MS-cleavable cross-linking reagent targeting acidic residues, over non-cleavable reagents for probing protein-protein interactions.

Comparison of DHSO and DSSO Cross-Linking

To assess the complementarity between acidic residue and primary amine cross-linking data, the similarities and differences between DHSO and DSSO cross-linking of selected model proteins were examined. To this end, LC-MS^(n) analyses of DSSO cross-linked myoglobin and BSA respectively were performed. As summarized in Table 3 and Table 4, 19 unique DSSO inter-linked myoglobin peptides and 33 unique DSSO inter-linked BSA peptides were identified.

TABLE 3

Summary of DSSO Inter-Linked Myoglobin Peptides Identified by LC MS^(n)

Each entry gives, on its first line, the entry number (#), MS m/z (z), Δ (ppm), and Cα-Cα distance; the next two lines give, for each inter-linked peptide, the peptide sequence (SEQ ID NO), AA location, modified position, and MS² m/z (z).

 1  MS m/z 618.5516 (z 5); Δ 0.116 ppm; distance 14.6 Å
    FDKFK (SEQ ID NO: 111); F44-K48; K_(T)46; MS² m/z 770.355 (z 1)
    KKGHHEAELKPLAQSHATKHK (SEQ ID NO: 112); K79-K99; K_(T)97; MS² m/z 584.053 (z 4)
 2  MS m/z 715.5761 (z 5); Δ -0.108 ppm; distance 11.9 Å|15.7 Å|18.3 Å
    FDKFKHLKTEAEMK (SEQ ID NO: 113); F44-K57; K_(A)46 & K_(A)48 & K_(T)51; MS² m/z 649.311 (z 3)
    KHGTVVLTALGGILK (SEQ ID NO: 114); K64-K78; K_(T)64; MS² m/z 796.963 (z 2)
 3  MS m/z 492.2618 (z 6); Δ 1.093 ppm; distance 27.1 Å|17.1 Å
    FKHLK (SEQ ID NO: 115); F47-K51; K_(A)48; MS² m/z 363.719 (z 2)
    GHHEAELKPLAQSHATKHK (SEQ ID NO: 119)|(SEQ ID NO: 120); G81-K99; K_(T)88|K_(T)97; MS² m/z 544.037 (z 4)
 4  MS m/z 590.5117 (z 5); Δ -0.608 ppm; distance 17.1 Å
    FKHLK (SEQ ID NO: 116); F47-K51; K_(T)48; MS² m/z 758.406 (z 1)
    GHHEAELKPLAQSHATKHK (SEQ ID NO: 121); G81-K99; K_(A)97; MS² m/z 544.037 (z 4)
 5  MS m/z 391.9812 (z 4); Δ -0.157 ppm; distance 15.6 Å
    FKHLK (SEQ ID NO: 116); F47-K51; K_(T)48; MS² m/z 363.719 (z 2)
    HKIPIK (SEQ ID NO: 117); H98-K103; K_(A)99; MS² m/z 411.239 (z 2)
 6  MS m/z 970.3080 (z 6); Δ 0.463 ppm; distance 26.1 Å|24.4 Å|15.4 Å
    FKHLKTEAEMKASEDLKK (SEQ ID NO: 122); F47-K64; K_(A)48 & K_(T)51 & K_(A)57; MS² m/z 776.385 (z 3)
    YLEFISDAIIHVLHSKHPGDFGADAQGAMTK (SEQ ID NO: 123); Y104-K134; K_(T)119; MS² m/z 1152.221 (z 3)
 7  MS m/z 608.5053 (z 6); Δ 1.553 ppm; distance 5.3 Å
    HGTVVLTALGGILKK (SEQ ID NO: 125); H65-K79; K_(A)78; MS² m/z 520.987 (z 3)
    KGHHEAELKPLAQSHATK (SEQ ID NO: 126); K80-K97; K_(T)80; MS² m/z 690.017 (z 3)
 8  MS m/z 647.3630 (z 4); Δ -0.850 ppm; distance 27.0 Å
    KHGTVVLTALGGILK (SEQ ID NO: 114); K64-K78; K_(T)64; MS² m/z 796.963 (z 2)
    NDIAAKYK (SEQ ID NO: 128); N141-K148; K_(A)146; MS² m/z 488.759 (z 2)
 9  MS m/z 837.6671 (z 4); Δ -1.356 ppm; distance 18.0 Å, 18.4 Å, 7.1 Å
    KKGHHEAELKPLAQSHATK (SEQ ID NO: 11); K79-K97; K_(A)79 & K_(T)80 & K_(A)88; MS² m/z 768.724 (z 3)
    NDIAAKYK (SEQ ID NO: 129); N141-K148; K_(T)146; MS² m/z 976.510 (z 1)
10  MS m/z 638.7356 (z 5); Δ 0.486 ppm; distance 18.4 Å|7.1 Å
    KKGHHEAELKPLAQSHATK (SEQ ID NO: 12)|(SEQ ID NO: 13); K79-K97; K_(T)80|K_(T)88; MS² m/z 549.789 (z 4)
    NDIAAKYK (SEQ ID NO: 129); N141-K148; K_(T)146; MS² m/z 1008.485 (z 1)
11  MS m/z 766.1448 (z 4); Δ 1.629 ppm; distance 7.1 Å
    KGHHEAELKPLAQSHATK (SEQ ID NO: 127); K80-K97; K_(T)88; MS² m/z 690.018 (z 3)
    NDIAAKYK (SEQ ID NO: 128); N141-K148; K_(A)146; MS² m/z 976.511 (z 1)
12  MS m/z 416.0513 (z 6); Δ 0.971 ppm; distance 8.0 Å
    LFTGHPETLEKFDK (SEQ ID NO: 130); L33-K46; K_(A)43; MS² m/z 429.721 (z 4)
    FKHLK (SEQ ID NO: 115); F47-K51; K_(A)48; MS² m/z 363.719 (z 2)
13  MS m/z 639.3386 (z 4); Δ -1.892 ppm; distance 7.7 Å
    LFTGHPETLEKFDK (SEQ ID NO: 131); L33-K46; K_(T)43; MS² m/z 874.421 (z 2)
    HKIPIK (SEQ ID NO: 117); H98-K103; K_(A)99; MS² m/z 395.253 (z 2)
14  MS m/z 737.5630 (z 6); Δ .054 ppm; distance 7.7 Å|13.0 Å
    LFTGHPETLEKFDK (SEQ ID NO: 131); L33-K46; K_(T)43; MS² m/z 874.423 (z 2)
    HKIPIKYLEFISDAIIHVLHSK (SEQ ID NO: 132)|(SEQ ID NO: 133); H98-K119; K_(A)99|K_(A)103; MS² m/z 664.631 (z 4)
15  MS m/z 801.4203 (z 6); Δ 2.184 ppm; distance 7.7 Å, 20.3 Å, 10.5 Å
    LFTGHPETLEKFDK (SEQ ID NO: 131); L33-K46; K_(T)43; MS² m/z 874.421 (z 2)
    KKGHHEAELKPLAQSHATKHKIPIK (SEQ ID NO: 134); K79-K103; K_(A)88 & K_(T)97 & K_(A)99; MS² m/z 755.913 (z 4)
16  MS m/z 831.8438 (z 5); Δ .575 ppm; distance 13.0 Å
    LFTGHPETLEKFDK (SEQ ID NO: 131); L33-K46; K_(T)43; MS² m/z 874.421 (z 2)
    IPIKYLEFISDAIIHVLHSK (SEQ ID NO: 135); I100-K119; K_(A)103; MS² m/z 797.454 (z 3)
17  MS m/z 561.7958 (z 6); Δ -1.21 ppm; distance 7.7 Å, 15.0 Å, 15.6 Å
    LFTGHPETLEKFDKFKHLK (SEQ ID NO: 136); L33-K51; K_(T)43 & K_(A)46 & K_(T)48; MS² m/z 636.063 (z 4)
    HKIPIK (SEQ ID NO: 118); H98-K103; K_(A)99; MS² m/z 395.253 (z 2)
18  MS m/z 754.6490 (z 4); Δ 0.028 ppm; distance 11.6 Å
    TEAEMKASEDLK (SEQ ID NO: 137); T52-K63; K_(A)57; MS² m/z 703.330 (z 2)
    KHGTVVLTALGGILK (SEQ ID NO: 114); K64-K78; K_(T)64; MS² m/z 796.963 (z 2)
19  MS m/z 100.0855 (z 5); Δ -2.117 ppm; distance 15.4 Å
    TEAEMKASEDLKK (SEQ ID NO: 138); T52-K65; K_(A)57; MS² m/z 767.378 (z 2)
    YLEFISDAIIHVLHSKHPGDFGADAQGAMTK (SEQ ID NO: 124); Y104-K134; K_(A)119; MS² m/z 1141.568 (z 3)

Note: “|” means or; “&” means and; “T”: unsaturated thiol moiety; “A”: alkene moiety; “ox” means oxidation.

TABLE 4

Summary of DSSO Inter-Linked BSA Peptides Identified by LC MS^(n)

Each entry gives, on its first line, the entry number (#), MS m/z (z), Δ (ppm), and Cα-Cα distance; the next two lines give, for each inter-linked peptide, the peptide sequence (SEQ ID NO), AA location, modified position, and MS² m/z (z).

 1  MS m/z 744.1126 (z 4); Δ 4.399 ppm; distance 13.5 Å
    ALKAWSVAR (SEQ ID NO: 139); A233-R241; K_(A)235; MS² m/z 528.305 (z 2)
    LAKEYEATLEECCAK (SEQ ID NO: 141); L372-K386; K_(T)374; MS² m/z 950.911 (z 2)
 2  MS m/z 754.9589 (z 5); Δ 3.924 ppm; distance 9.7 Å
    ALKAWSVAR (SEQ ID NO: 139); A233-R241; K_(A)235; MS² m/z 528.306 (z 2)
    VHKECCHGDLLECADDRADLAK (SEQ ID NO: 58); V264-K285; K_(T)266; MS² m/z 900.056 (z 3)
 3  MS m/z 578.8257 (z 4); Δ 0.636 ppm; distance 9.0 Å
    ALKAWSVAR (SEQ ID NO: 140); A233-R241; K_(T)235; MS² m/z 544.291 (z 2)
    LVTDLTKVHK (SEQ ID NO: 142); L257-K266; K_(A)263; MS² m/z 604.357 (z 2)
 4  MS m/z 948.9763 (z 4); Δ 5.128 ppm; distance 13.8 Å
    AEFVEVTKLVTDLTK (SEQ ID NO: 144); A249-K263; K_(A)256; MS² m/z 873.984 (z 2)
    ADLAKYICDNQDTISSK (SEQ ID NO: 145); A281-K297; K_(T)285; MS² m/z 1014.458 (z 2)
 5  MS m/z 889.6097 (z 5); Δ 1.632 ppm; distance 9.1 Å
    ADLAKYICDNQDTISSK (SEQ ID NO: 145); A281-K297; K_(T)285; MS² m/z 1014.458 (z 2)
    ECCDKPLLEKSHCIAEVEK (SEQ ID NO: 146); E300-K318; K_(A)309; MS² m/z 800.374 (z 3)
 6  MS m/z 889.6097 (z 5); Δ 1.632 ppm; distance 11.3 Å|9.1 Å
    ADLAKYICDNQDTISSK (SEQ ID NO: 145); A281-K297; K_(T)285; MS² m/z 1014.458 (z 2)
    ECCDKPLLEKSHCIAEVEK (SEQ ID NO: 147)|(SEQ ID NO: 148); E300-K318; K_(T)304|K_(T)309; MS² m/z 811.031 (z 3)
 7  MS m/z 792.6119 (z 4); Δ 3.328 ppm; distance 16.4 Å
    CASIQKFGER (SEQ ID NO: 149); C223-R232; K_(A)228; MS² m/z 625.306 (z 2)
    LAKEYEATLEECCAK (SEQ ID NO: 141); L372-K386; K_(T)374; MS² m/z 950.910 (z 2)
 8  MS m/z 723.8583 (z 4); Δ 2.115 ppm; distance 13.2 Å
    CASIQKFGER (SEQ ID NO: 149); C223-R232; K_(A)228; MS² m/z 625.305 (z 2)
    LCVLHEKTPVSEK (SEQ ID NO: 151); L483-K495; K_(T)489; MS² m/z 813.408 (z 2)
 9  MS m/z 705.5812 (z 4); Δ 3.067 ppm; distance 13.1 Å
    CASIQKFGER (SEQ ID NO: 149); C223-R232; K_(A)228; MS² m/z 641.290 (z 2)
    VTKCCTESLVNR (SEQ ID NO: 152); V496-R507; K_(T)498; MS² m/z 776.852 (z 2)
10  MS m/z 900.6266 (z 5); Δ 0.080 ppm; distance 13.2 Å|17.6 Å|13.1 Å
    CASIQKFGER (SEQ ID NO: 149); C223-R232; K_(A)228; MS² m/z 625.305 (z 2)
    LCVLHEKTPVSEKVTKCCTESLVNR (SEQ ID NO: 154); L483-K507; K_(T)489 & K_(T)495 & K_(A)498; MS² m/z 1071.836 (z 3)
11  MS m/z 543.2742 (z 4); Δ 0.293 ppm; distance 27.2 Å
    CASIQKFGER (SEQ ID NO: 150); C223-R232; K_(T)228; MS² m/z 641.291 (z 2)
    SLGKVGTR (SEQ ID NO: 153); S452-R459; K_(A)455; MS² m/z 436.254 (z 2)
12  MS m/z 520.8516 (z 5); Δ 3.487 ppm; distance 13.4 Å
    DTHKSEIAHR (SEQ ID NO: 53); D25-R34; K_(A)28; MS² m/z 416.544 (z 3)
    FKDLGEEHFK (SEQ ID NO: 155); F35-K44; K_(T)36; MS² m/z 668.308 (z 2)
13  MS m/z 501.6659 (z 5); Δ 3.313 ppm; distance 21.4 Å
    DTHKSEIAHR (SEQ ID NO: 53); D25-R34; K_(A)28; MS² m/z 416.544 (z 3)
    LVTDLTKVHK (SEQ ID NO: 143); L257-K266; K_(T)263; MS² m/z 620.343 (z 2)
14  MS m/z 746.8786 (z 4); Δ 3.617 ppm; distance 15.6 Å
    FPKAEFVEVTK (SEQ ID NO: 157); F246-K256; K_(T)248; MS² m/z 674.864 (z 2)
    LKECCDKPLLEK (SEQ ID NO: 87); L298-K309; K_(A)299; MS² m/z 809.889 (z 2)
15  MS m/z 784.8886 (z 4); Δ 3.457 ppm; distance 11.7 Å
    FPKAEFVEVTK (SEQ ID NO: 158); F246-K256; K_(A)248; MS² m/z 674.863 (z 2)
    YICDNQDTISSKLK (SEQ ID NO: 159); Y286-K299; K_(T)297; MS² m/z 885.909 (z 2)
16  MS m/z 803.7743 (z 5); Δ 5.714 ppm; distance 9.2 Å
    FKDLGEEHFK (SEQ ID NO: 156); F35-K44; K_(A)36; MS² m/z 652.321 (z 2)
    LVNELTEFAKTCVADESHAGCEK (SEQ ID NO: 160); L66-K88; K_(A)75; MS² m/z 888.078 (z 3)
17  MS m/z 837.8948 (z 4); Δ 5.217 ppm; distance 16.7 Å
    FKDLGEEHFK (SEQ ID NO: 156); F35-K44; K_(A)36; MS² m/z 652.320 (z 2)
    ADLAKYICDNQDTISSK (SEQ ID NO: 145); A281-K297; K_(T)285; MS² m/z 1014.458 (z 2)
18  MS m/z 1014.2459 (z 5); Δ 3.074 ppm; distance 13.6 Å
    HPYFYAPELLYYANKYNGVFQECCQAEDK (SEQ ID NO: 161); H169-K197; K_(A)183; MS² m/z 1224.555 (z 3)
    ECCDKPLLEK (SEQ ID NO: 110); E300-K309; K_(A)304; MS² m/z 673.313 (z 2)
19  MS m/z 1036.6772 (z 5); Δ 2.428 ppm; distance 10.1 Å
    HPYFYAPELLYYANKYNGVFQECCQAEDK (SEQ ID NO: 162); H169-K197; K_(T)183; MS² m/z 1235.200 (z 3)
    GACLLPKIETM(ox)R (SEQ ID NO: 163); G198-R209; K_(A)204; MS² m/z 728.878 (z 2)
20  MS m/z 555.8971 (z 5); Δ 0.899 ppm; distance —
    HKPKATEEQLK (SEQ ID NO: 164); H558-K568; K_(A)561; MS² m/z 454.918 (z 3)
    HKPKATEEQLK (SEQ ID NO: 165); H558-K568; K_(T)561; MS² m/z 697.863 (z 2)
21  MS m/z 879.4573 (z 4); Δ 0.408 ppm; distance 14.3 Å
    KQTALVELLK (SEQ ID NO: 167); K548-K557; K_(A)548; MS² m/z 598.868 (z 2)
    ATEEQLKTVM(ox)ENFVAFVDK (SEQ ID NO: 168); A562-K580; K_(T)568; MS² m/z 1151.104 (z 2)
22  MS m/z 795.8878 (z 4); Δ 1.369 ppm; distance 14.7 Å
    LKPDPNTLCDEFK (SEQ ID NO: 80); L139-K151; K_(T)140; MS² m/z 831.881 (z 2)
    FWGKYLYEIAR (SEQ ID NO: 170); F157-R167; K_(A)160; MS² m/z 750.391 (z 2)
23  MS m/z 638.5690 (z 4); Δ 0.425 ppm; distance 21.3 Å
    LKPDPNTLCDEFK (SEQ ID NO: 80); L139-K151; K_(T)140; MS² m/z 831.880 (z 2)
    SLGKVGTR (SEQ ID NO: 153); S452-R459; K_(A)455; MS² m/z 436.255 (z 2)
24  MS m/z 543.5061 (z 4); Δ 4.415 ppm; distance 20.5 Å
    LSQKFPK (SEQ ID NO: 171); L242-K248; K_(A)245; MS² m/z 451.263 (z 2)
    CCTKPESER (SEQ ID NO: 74); C460-R468; K_(T)463; MS² m/z 626.744 (z 2)
25  MS m/z 603.3317 (z 5); Δ 3.556 ppm; distance 42.1 Å|38.3 Å
    LKHLVDEPQNLIK (SEQ ID NO: 95); L400-K412; K_(T)401; MS² m/z 816.948 (z 2)
    HKPKATEEQLK (SEQ ID NO: 166)|(SEQ ID NO: 164); H558-K568; K_(A)559|K_(A)561; MS² m/z 454.920 (z 3)
26  MS m/z 894.9782 (z 4); Δ 6.080 ppm; distance 6.4 Å
    LFTFHADICTLPDTEKQIK (SEQ ID NO: 172); L529-K547; K_(A)544; MS² m/z 1166.089 (z 2)
    KQTALVELLK (SEQ ID NO: 167); K548-K557; K_(A)548; MS² m/z 598.868 (z 2)
27  MS m/z 739.8589 (z 6); Δ 2.696 ppm; distance 19.9 Å
    NECFLSHKDDSPDLPKLKPDPNTLCDEFK (SEQ ID NO: 173); N123-K151; K_(T)138; MS² m/z 887.159 (z 4)
    SLGKVGTR (SEQ ID NO: 153); S452-R459; K_(A)455; MS² m/z 436.254 (z 2)
28  MS m/z 773.1790 (z 5); Δ 3.565 ppm; distance 19.9 Å|21.3 Å
    DDSPDLPKLKPDPNTLCDEFK (SEQ ID NO: 174)|(SEQ ID NO: 175); D131-K151; K_(T)138|K_(T)140; MS² m/z 991.79 (z 3)
    SLGKVGTR (SEQ ID NO: 153); S452-R459; K_(A)455; MS² m/z 436.254 (z 2)
29  MS m/z 1226.3392 (z 4); Δ 4.267 ppm; distance 14.6 Å
    QNCDQFEKLGEYGFQNALIVR (SEQ ID NO: 176); Q413-R433; K_(T)420; MS² m/z 1308.109 (z 2)
    ATEEQLKTVM(ox)ENFVAFVDK (SEQ ID NO: 169); A562-K580; K_(A)568; MS² m/z 1135.061 (z 2)
30  MS m/z 1082.5466 (z 4); Δ 1.824 ppm; distance 22.3 Å
    QNCDQFEKLGEYGFQNALIVR (SEQ ID NO: 176); Q413-R433; K_(T)420; MS² m/z 1308.113 (z 2)
    KVPQVSTPTLVEVSR (SEQ ID NO: 177); K437-R451; K_(A)437; MS² m/z 847.479 (z 2)
31  MS m/z 536.0019 (z 4); Δ 3.445 ppm; distance 13.5 Å
    SLGKVGTR (SEQ ID NO: 153); S452-R459; K_(A)455; MS² m/z 436.255 (z 2)
    CCTKPESER (SEQ ID NO: 74); C460-R468; K_(T)463; MS² m/z 626.743 (z 2)
32  MS m/z 491.6657 (z 5); Δ 3.475 ppm; distance 19.8 Å|18.7 Å
    TPVSEKVTK (SEQ ID NO: 98); T490-K498; K_(T)495; MS² m/z 537.780 (z 2)
    HKPKATEEQLK (SEQ ID NO: 166)|(SEQ ID NO: 164); H558-K568; K_(A)559|K_(A)561; MS² m/z 454.918 (z 3)
33  MS m/z 614.3301 (z 4); Δ 3.141 ppm; distance 18.7 Å
    TPVSEKVTK (SEQ ID NO: 99); T490-K498; K_(A)495; MS² m/z 521.794 (z 2)
    HKPKATEEQLK (SEQ ID NO: 165); H558-K568; K_(T)561; MS² m/z 697.861 (z 2)

Note: “|” means or; “&” means and; “T”: unsaturated thiol moiety; “A”: alkene moiety; “ox” means oxidation.

These linkages were then mapped onto their corresponding protein linear sequences (FIG. 5C and FIG. 6D) and the same crystal structures used in DHSO analysis (FIG. 5D and FIG. 12B). As shown, all of myoglobin DSSO cross-links and 94% of BSA DSSO cross-links corresponded to Cα-Cα distances ≤35 Å (FIG. 5D, FIG. 5E, FIG. 12B, and FIG. 12C). The two links that are outside the distance range may be a result of unexpected structural flexibility.
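By way of non-limiting illustration, the distance evaluation described above can be sketched in Python. The function names `ca_distance` and `fraction_within`, the coordinate format, and the example distance list are hypothetical and are not part of the disclosed workflow; only the 35 Å cutoff is taken from the comparison above.

```python
import math

def ca_distance(ca_a, ca_b):
    """Euclidean distance (in Å) between two Cα coordinates given as (x, y, z)."""
    return math.dist(ca_a, ca_b)

def fraction_within(distances, cutoff=35.0):
    """Fraction of cross-links whose Cα-Cα distance is at or below the cutoff (Å)."""
    if not distances:
        raise ValueError("at least one distance is required")
    return sum(d <= cutoff for d in distances) / len(distances)

# Hypothetical distances (Å) chosen for illustration; not the reported data set.
example_distances = [14.6, 11.9, 27.1, 5.3, 42.1]
print(f"{fraction_within(example_distances):.0%} of example links within 35 Å")
```

In practice the Cα coordinates would be read from the crystal structure used for mapping; that parsing step is omitted here.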

In the case of myoglobin, DSSO cross-linking identified several proximal helical regions, such as α4-α5, α4-α7, α5-α8, α6-α8, and α5-3₁₀. In comparison, there is little overlap between DHSO and DSSO cross-link maps except the regions containing α4-α5 and α6-α8 (FIG. 5B and FIG. 5C), indicating that DHSO and DSSO cross-linking mapped different parts of interactions within myoglobin. The interacting helical regions unique to DHSO or DSSO cross-linking correspond well with their cross-linkable residues and specific reactive chemistries. This is due to the fact that lysine and acidic residues are distributed unevenly across the myoglobin sequence. For example, the N-terminal region of myoglobin (residues 1-41) spanning helices α1 through α3 contains only one lysine, but four glutamic acid and two aspartic acid residues. Therefore, profiling the interactions of the N-terminus within itself and with other parts of the protein will be difficult with an amine-reactive cross-linking reagent such as DSSO. In contrast, the acidic residue-reactive cross-linker DHSO would be better suited for this purpose. Indeed, while DSSO was not able to cover this region, as expected, DHSO cross-linking enabled the identification of 11 inter-linked peptides describing multiple interactions between the N-terminus and other parts of the protein (i.e., α1-α5, α1-α8, α2-α4, α3-α4, and α3-α8). While DHSO provided exclusive data from the lysine-scarce N-terminus, the lysine-rich 3₁₀ helix and many of the loop regions between the helical structures were better analyzed by DSSO due to the higher abundance of lysine residues in these regions. Together, these results demonstrate that acidic residue cross-linking can provide complementary structural information to that obtained using amine-reactive cross-linkers.
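As a minimal sketch, assuming a protein region is available as a one-letter amino acid string, the relative abundance of cross-linkable residues discussed above can be tallied as follows. The function name `crosslinkable_counts` is hypothetical; the input shown is the L33-K46 myoglobin peptide listed in Table 3 and is used only as a short example, not as the N-terminal region discussed above.

```python
def crosslinkable_counts(sequence):
    """Count lysine (K) and acidic (D or E) residues in a one-letter sequence."""
    seq = sequence.upper()
    return {
        "K": seq.count("K"),                      # targets of amine-reactive DSSO
        "D|E": seq.count("D") + seq.count("E"),   # targets of acidic residue-reactive DHSO
    }

# Peptide L33-K46 from Table 3, used only as an example input.
print(crosslinkable_counts("LFTGHPETLEKFDK"))
```

A region with few lysines but several D|E residues would be expected to be better covered by DHSO than by DSSO, consistent with the observation above.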

In some embodiments, unlike myoglobin, DHSO and DSSO cross-linking of BSA resulted in much more similar cross-linking profiles, meaning that similar interactions within BSA were identified (FIG. 6A and FIG. 6D). Without being bound by any theory, this is most likely owing to the fact that BSA has a more evenly dispersed distribution of lysine, aspartic acid, and glutamic acid residues throughout the protein sequence. Thus, complementary usage of DHSO and DSSO can strengthen the validity of the cross-links identified by either reagent individually. More importantly, in some embodiments, this will generate complementary structural information to facilitate a more comprehensive understanding of protein structures.

EXAMPLES

The following Examples are non-limiting and other variants contemplated by one of ordinary skill in the art are included within the scope of this disclosure.

Example 1—Materials and Reagents

General chemicals were purchased from Fisher Scientific or VWR International. Bovine serum albumin (≥96% purity), myoglobin from equine heart (≥90% purity), and DMTMM (≥96% purity) were purchased from Sigma-Aldrich. Ac-SR8 peptide (Ac-SAKAYEHR (SEQ ID NO: 178), 98.22% purity) was custom ordered from Biomatik (Wilmington, Del.).

Example 2—Dihydrazide Sulfoxide (DHSO) Synthesis

Disuccinimidyl sulfoxide (DSSO) was synthesized as previously published¹⁶. The 2-step synthesis scheme for DHSO from DSSO is depicted in FIG. 1D. Briefly, tert-butyl carbazate (1.10 g, 8.32 mmol) was added to DSSO (1.41 g, 4.16 mmol) in dichloromethane (DCM) (50 mL). The resulting yellow solution was stirred at room temperature for 12 h, after which trifluoroacetic acid (2.20 mL, 28.7 mmol) was added. The resulting orange solution was stirred for 72 h before removing the solvent in vacuo. The resulting orange oil was dissolved in methanol, and then triethylamine was added. The resulting mixture was stirred for 20 min, after which a white solid had precipitated. The solid was collected by centrifugation, and then stirred with fresh methanol for 20 min. The solid was collected by centrifugation again, and this process of stirring with fresh methanol was repeated another two times. Drying the isolated white solid in vacuo afforded DHSO (0.375 g, 46%): mp 159-162° C.; ¹H NMR (500 MHz, DMSO-d₆): δ 9.13 (s, 2H), 4.24 (s, 4H), 3.0-2.97 (m, 2H), 2.83-2.75 (m, 2H), 2.43 (t, J=7.5 Hz, 4H); ¹³C NMR (125 MHz, DMSO-d₆): δ 169.3, 46.7, 26.2; IR (thin film): 3308, 3044, 1631, 1449, 1297, 1032 cm⁻¹; HRMS (ESI) m/z calculated for C₆H₁₅N₄O₃S [M+H]⁺ 223.0865, found 223.0857.
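The HRMS values above can be checked with a short calculation. The following is a sketch only: the isotope masses are standard monoisotopic values, the helper names are hypothetical, and the calculation sums the atomic masses of the protonated formula C₆H₁₅N₄O₃S without subtracting the electron mass, which is an assumption made here because it reproduces the calculated value to four decimal places.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "S": 31.9720707}

def monoisotopic_mass(composition):
    """Sum monoisotopic masses for a composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())

def ppm_error(observed, calculated):
    """Signed mass error of an observed value relative to a calculated value, in ppm."""
    return (observed - calculated) / calculated * 1e6

# Protonated DHSO, C6H15N4O3S (electron mass not subtracted; see assumption above).
calculated = monoisotopic_mass({"C": 6, "H": 15, "N": 4, "O": 3, "S": 1})
print(f"calculated m/z: {calculated:.4f}")
print(f"error vs. found 223.0857: {ppm_error(223.0857, 223.0865):.1f} ppm")
```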

Example 3—DHSO Cross-Linking of Synthetic Peptides

Synthetic peptide Ac-SR8 was dissolved in DMSO to 1 mM and cross-linked with DHSO in a 1:1 molar ratio of peptide to cross-linker in the presence of 1 equivalent of diisopropylethylamine and DMTMM. The resulting samples were diluted to 10 pmol/μL in 3% ACN/2% formic acid prior to MS^(n) analysis.

Example 4—DHSO Cross-Linking of Equine Myoglobin and Bovine Serum Albumin

50 μL of 50 μM BSA or 200 μM myoglobin in PBS buffer (pH 7.4) was reacted with DHSO in protein-to-cross-linker molar ratios of 1:5, 1:10, 1:20, and 1:30. The cross-linking reaction was initiated by adding equivalent concentrations of DHSO and DMTMM to the protein solutions and was allowed to proceed for 1 h at room temperature.
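For orientation, the reagent amounts implied by these molar ratios can be computed as in the following sketch. The function names are hypothetical, and the DMTMM amount is assumed to equal the DHSO amount, following the statement above that equivalent concentrations of the two were added.

```python
def protein_nmol(volume_ul, conc_um):
    """Protein amount in nmol from volume (µL) and concentration (µM)."""
    return volume_ul * conc_um / 1000.0

def crosslinker_nmol(volume_ul, conc_um, molar_excess):
    """DHSO amount in nmol for a 1:molar_excess protein-to-cross-linker ratio."""
    return protein_nmol(volume_ul, conc_um) * molar_excess

for name, conc_um in (("BSA", 50), ("myoglobin", 200)):
    for excess in (5, 10, 20, 30):
        dhso = crosslinker_nmol(50, conc_um, excess)
        # DMTMM assumed equimolar to DHSO, per the text above.
        print(f"{name} 1:{excess}: {dhso:g} nmol DHSO, {dhso:g} nmol DMTMM")
```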

Example 5—Digestion of DHSO Cross-Linked Proteins

Cross-linked protein samples were subjected to either SDS-PAGE followed by in-gel digestion, or directly digested in solution. For in-gel digestion, cross-linked proteins were separated by SDS-PAGE and visualized by Coomassie blue staining. The selected bands were excised, reduced with TCEP for 30 min, alkylated with iodoacetamide for 30 min in the dark, and then digested with trypsin at 37° C. overnight. Peptide digests were extracted, concentrated, and reconstituted in 3% ACN/2% formic acid for MS^(n) analysis. For in-solution digestion, cross-linked proteins were first precipitated with TCA and then re-suspended in 8M urea buffer. Reduction and alkylation were performed prior to Lys-C/trypsin digestion as previously described²⁴. The resulting digests were desalted using Waters C18 Sep-Pak cartridges and fractionated by peptide size exclusion chromatography (SEC) based on the protocol by Leitner et al.²⁵. The fractions containing cross-linked peptides were collected for subsequent MS^(n) analysis.

Example 6—Liquid Chromatography-Multistage Tandem Mass Spectrometry (LC MS^(n)) Analysis

DHSO cross-linked peptides were analyzed by LC-MS^(n) utilizing an Easy-nLC 1000 (Thermo Fisher, San Jose, Calif.) coupled on-line to an LTQ-Orbitrap XL mass spectrometer (Thermo Fisher, San Jose, Calif.) as previously described^(16,17). Each MS^(n) experiment consists of one MS scan in FT mode (350-1400 m/z, resolution of 60,000 at m/z 400) followed by two data-dependent MS² scans in FT mode (resolution of 7500) with normalized collision energy at 10% on the top two MS peaks with charges 4+ or higher, and three MS³ scans in the LTQ with normalized collision energy at 35% on the top three peaks from each MS².
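The data-dependent selection logic described above (top two MS peaks with charge 4+ or higher for MS², then the top three peaks from each MS² for MS³) can be sketched as follows. This is an illustration only and not the instrument method: the function `select_top`, the peak tuples, and all intensity values are hypothetical, and features of real acquisition software such as dynamic exclusion are not modeled.

```python
def select_top(peaks, n, min_charge=1):
    """Return the n most intense peaks with charge >= min_charge.

    Each peak is a (mz, intensity, charge) tuple.
    """
    eligible = [p for p in peaks if p[2] >= min_charge]
    return sorted(eligible, key=lambda p: p[1], reverse=True)[:n]

# Hypothetical MS survey-scan peaks: (m/z, intensity, charge).
ms1_peaks = [(618.55, 1.0e6, 5), (500.10, 5.0e6, 2),
             (715.58, 2.0e6, 5), (492.26, 5.0e5, 6)]
ms2_targets = select_top(ms1_peaks, n=2, min_charge=4)

# Hypothetical fragments observed in one MS² scan.
ms2_fragments = [(770.36, 8.0e5, 1), (584.05, 9.0e5, 4),
                 (600.00, 1.0e5, 2), (650.00, 2.0e5, 3)]
ms3_targets = select_top(ms2_fragments, n=3)

print(ms2_targets)
print(ms3_targets)
```

Restricting MS² to precursors of charge 4+ or higher favors inter-linked peptides, which carry two peptide chains and therefore tend to have higher charge states than unlinked peptides.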

Example 7—Data Analysis and Identification of DHSO Cross-Linked Peptides

Monoisotopic masses of parent ions and corresponding fragment ions, parent ion charge states, and ion intensities from MS² and MS³ spectra were extracted using in-house software based on the Raw_Extract script from Xcalibur v2.4 (Thermo Scientific) as previously described¹⁷. MS³ data was subjected to a developmental version of Protein Prospector (v.5.16.0) for database searching, using Batch-Tag against SwissProt.2014.12.4.random.concat databases limited to either the Bos taurus or Equus caballus taxonomy with mass tolerances for parent ions and fragment ions set as ±20 ppm and 0.6 Da, respectively. Trypsin was set as the enzyme with four maximum missed cleavages allowed. Cysteine carbamidomethylation was set as a constant modification. A maximum of four variable modifications were also allowed, including protein N-terminal acetylation, methionine oxidation, and N-terminal conversion of glutamine to pyroglutamic acid. In addition, three defined modifications representing cross-linker fragment moieties on aspartic acid and glutamic acid were selected: alkene (A, C₃H₄N₂, +68 Da), sulfenic acid (S, C₃H₆N₂SO, +118 Da), and unsaturated thiol (T, C₃H₄N₂S, +100 Da) modifications. Initial acceptance criteria for peptide identification required a reported expectation value ≤0.1. The in-house program XL-Discoverer, a revised version of previously developed Link-Hunter, was used to validate and summarize cross-linked peptides based on MS^(n) data and database searching¹⁶.
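As a sketch under stated assumptions, the nominal masses of the three cross-linker remnant modifications and the ±20 ppm parent ion tolerance can be expressed as follows. The elemental compositions are those given above; the isotope masses are standard monoisotopic values; the names `REMNANTS`, `composition_mass`, and `within_ppm` are hypothetical and do not correspond to Protein Prospector or XL-Discoverer functions.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "S": 31.9720707}

# Remnant compositions as stated in the text.
REMNANTS = {
    "alkene (A)": {"C": 3, "H": 4, "N": 2},
    "sulfenic acid (S)": {"C": 3, "H": 6, "N": 2, "S": 1, "O": 1},
    "unsaturated thiol (T)": {"C": 3, "H": 4, "N": 2, "S": 1},
}

def composition_mass(composition):
    """Monoisotopic mass (Da) of a composition given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items())

def within_ppm(observed, theoretical, tol_ppm=20.0):
    """True if the observed mass lies within ±tol_ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm

for name, comp in REMNANTS.items():
    print(f"{name}: +{composition_mass(comp):.4f} Da")
```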

Example 8—Dissecting Structural Dynamics of Human COP9 Signalosome (CSN)

A novel approach that integrates multiplexed quantitation with different cross-linking chemistries for comprehensive structural comparison of protein complexes is provided. Cross-linking mass spectrometry (XL-MS) has become a powerful structural tool for elucidating architectures of protein complexes. Current XL-MS studies have predominantly used lysine-targeting cross-linkers. However, there remains difficulty in characterizing interaction interfaces with limited lysine residues. Recently we developed an acidic residue-targeting cross-linker, dihydrazide sulfoxide (DHSO)¹⁶, enabling improved coverage of protein interactions. Because protein complexes are dynamic entities, quantitative XL-MS strategies are required for their comparative analysis. To improve throughput and quantitation accuracy, a QMIX (Quantitation of Multiplexed Isobaric-labeled cross-linked (X) peptides) strategy²⁸ was developed for simultaneous comparison of multiple conformational states of protein complexes. The QMIX strategy was integrated with cross-linkers targeting different residues to dissect structural dynamics of human COP9 signalosome (CSN).

Two types of CSN complexes, CSN I and CSN II, were purified and cross-linked using DSSO or DHSO, respectively. The resulting cross-linked proteins were digested, and cross-linked peptides were isolated using peptide SEC. Following SEC separation, selected fractions were labeled with TMT 6-plex™ isobaric reagents (Thermo Scientific). The isobaric reagent-labeled cross-linked peptides were mixed equally and analyzed by LC-MS^(n) using an Easy-nLC 1200 (Thermo Scientific) coupled on-line to an Orbitrap Fusion Lumos Tribrid MS (Thermo Scientific). The resulting mass spectrometric data were subjected to protein database searching using Protein Prospector. Quantitation and cross-link identification were performed using the in-house software package XL-Discoverer, designed to integrate various layers of MS^(n) data to automatically summarize and validate the identification of cross-linked peptides.

CSN is a protein complex that functions as a deneddylase for cullin-RING ligases (CRLs) and thus modulates the assembly and action of CRLs. Although CSN is known to consist of eight subunits, a new stoichiometric subunit (i.e., CSN 9) has been recently identified¹⁶ that can enhance the CSN activity. To determine the molecular details underlying composition-dependent activation and action mechanisms of the CSN complex, we have established a new integrated strategy combining a multiplexed QXL-MS strategy (QMIX²⁸) and the two sulfoxide-containing MS-cleavable cross-linkers with diverse functional groups, DSSO²⁹ and DHSO³⁰. The identification of cross-linked peptides is based on the XL-MS workflow designed for sulfoxide-containing MS-cleavable cross-linkers using multistage tandem mass spectrometry (MS^(n))^(28,16). Initial results from our XL-MS studies have identified 22 inter-subunit interactions comprising 160 inter-subunit linkages using DHSO, and 19 inter-subunit interactions comprising 107 inter-subunit linkages using DSSO. Additionally, 150 DSSO and 215 DHSO intra-subunit linkages were identified. Mapping of the linkages onto the CSN crystal structure (PDB: 4D10) revealed that 84% of DSSO intra-links and 89% of DHSO intra-links were <30 Å. The detection of cross-links that violate this distance constraint suggests an alternative positioning or unexpected flexibility in parts of the CSN complex. Concurrent usage of lysine- and acidic residue-targeting cross-linkers has allowed both validation of complementarily identified regions and elucidation of regions where either lysine or acidic residues are sparse. The results have provided new structural details for refining the CSN structure and understanding its action mechanisms. The technologies presented here can be easily adapted to study other protein complexes.

Perspectives

In some embodiments, provided herein is the development and characterization of a new acidic residue-targeting, sulfoxide-containing MS-cleavable cross-linker, dihydrazide sulfoxide (DHSO), which is a new derivative of the inventors' previously developed amine-reactive MS-cleavable reagent, DSSO¹⁶. In some embodiments, the extensive analyses herein have proven that DHSO cross-linked peptides possess the same characteristics distinctive to peptides cross-linked by other sulfoxide-containing amine-reactive cross-linkers¹⁶⁻¹⁸, thus enabling their simplified identification by MS^(n) analysis. The unique features of DHSO will significantly facilitate cross-linking studies targeting acidic residues, which have been difficult in the past due to the large number of D|E residues present in protein sequences and the complexity of their resulting cross-linked peptides for MS analysis.

Comparison of DHSO and DSSO cross-linking confirms the need to expand the coverage of protein interactions using cross-linkers targeting different residues, especially when the distribution of specific amino acids is uneven. In some embodiments, this disclosure demonstrates the robustness and potential of the XL-MS technology based on sulfoxide-containing MS-cleavable cross-linkers and provides a viable analytical platform for the development of new MS-cleavable cross-linker derivatives to further define protein-protein interactions. Without being bound by any theory, it is believed that the development of these new tools will aid in the goal of understanding the structural dynamics of protein complexes and their mechanistic functions at the global scale in the future.

In some embodiments, the workflow disclosed herein will allow fast, effective, accurate and robust identification of DHSO cross-linked peptides. In some embodiments, the results herein demonstrate that DHSO cross-linked peptides possess the same characteristics distinctive to peptides cross-linked by other sulfoxide-containing amine reactive crosslinkers previously developed by the inventors, thus enabling their simplified identification by MS^(n) analysis. The development of DHSO cross-linking will aid in the goal of defining protein-protein interactions at the global scale and understanding the structural dynamics of protein complexes and their mechanistic functions in cells.

As used herein, the section headings are for organizational purposes only and are not to be construed as limiting the described subject matter in any way. All literature and similar materials cited in this application, including but not limited to, patents, patent applications, articles, books, treatises, and internet web pages are expressly incorporated by reference in their entirety for any purpose. When definitions of terms in incorporated references appear to differ from the definitions provided in the present teachings, the definition provided in the present teachings shall control. It will be appreciated that there is an implied “about” prior to the temperatures, concentrations, times, etc. discussed in the present teachings, such that slight and insubstantial deviations are within the scope of the present teachings herein.

In this application, the use of the singular includes the plural unless specifically stated otherwise. Also, the use of “comprise”, “comprises”, “comprising”, “contain”, “contains”, “containing”, “include”, “includes”, and “including” are not intended to be limiting.

As used in this specification and claims, the singular forms “a,” “an” and “the” include plural references unless the content clearly dictates otherwise.

As used herein, “about” means a quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length that varies by as much as 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1% to a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length.

Although this disclosure is in the context of certain embodiments and examples, those skilled in the art will understand that the present disclosure extends beyond the specifically disclosed embodiments to other alternative embodiments and/or uses of the embodiments and obvious modifications and equivalents thereof. In addition, while several variations of the embodiments have been shown and described in detail, other modifications, which are within the scope of this disclosure, will be readily apparent to those of skill in the art based upon this disclosure. It is also contemplated that various combinations or sub-combinations of the specific features and aspects of the embodiments may be made and still fall within the scope of the disclosure. It should be understood that various features and aspects of the disclosed embodiments can be combined with, or substituted for, one another in order to form varying modes or embodiments of the disclosure. Thus, it is intended that the scope of the present disclosure herein disclosed should not be limited by the particular disclosed embodiments described above.

Abbreviations

XL-MS: cross-linking mass spectrometry

DSSO: disuccinimidyl sulfoxide

DMDSSO: dimethyl disuccinimidyl sulfoxide

Azide-A-DSBSO: Azide-tagged, Acid-cleavable DiSuccinimidyl BisSulfoxide

DHSO: dihydrazide sulfoxide, a.k.a. 3,3′-sulfinyldi(propanehydrazide)

MS: mass spectrometry

MS²: tandem mass spectrometry

MS³: third stage tandem mass spectrometry

MS^(n): multi-stage tandem mass spectrometry

CID: collision-induced dissociation

LC MS^(n): liquid chromatography multistage tandem mass spectrometry

Asp: aspartic acid

Glu: glutamic acid

REFERENCES

All references cited in this disclosure are incorporated herein by reference in their entireties.

-   (1) Sinz, A. Mass Spectrom Rev 2006, 25, 663-682.
-   (2) Walzthoeni, T.; Leitner, A.; Stengel, F.; Aebersold, R. Curr Opin Struct Biol 2013, 23, 252-260.
-   (3) Chen, Z. A.; Jawhari, A.; Fischer, L.; Buchen, C.; Tahir, S.; Kamenski, T.; Rasmussen, M.; Lariviere, L.; Bukowski-Wills, J. C.; Nilges, M.; Cramer, P.; Rappsilber, J. EMBO J 2010, 29, 717-726.
-   (4) Herzog, F.; Kahraman, A.; Boehringer, D.; Mak, R.; Bracher, A.; Walzthoeni, T.; Leitner, A.; Beck, M.; Hartl, F. U.; Ban, N.; Malmstrom, L.; Aebersold, R. Science 2012, 337, 1348-1352.
-   (5) Kao, A.; Randall, A.; Yang, Y.; Patel, V. R.; Kandur, W.; Guan, S.; Rychnovsky, S. D.; Baldi, P.; Huang, L. Mol Cell Proteomics 2012, 11, 1566-1577.
-   (6) Erzberger, J. P.; Stengel, F.; Pellarin, R.; Zhang, S.; Schaefer, T.; Aylett, C. H.; Cimermancic, P.; Boehringer, D.; Sali, A.; Aebersold, R.; Ban, N. Cell 2014, 158, 1123-1135.
-   (7) Shi, Y.; Fernandez-Martinez, J.; Tjioe, E.; Pellarin, R.; Kim, S. J.; Williams, R.; Schneidman, D.; Sali, A.; Rout, M. P.; Chait, B. T. Mol Cell Proteomics 2014, 13, 2927-2943.
-   (8) Zeng-Elmore, X.; Gao, X. Z.; Pellarin, R.; Schneidman-Duhovny, D.; Zhang, X. J.; Kozacka, K. A.; Tang, Y.; Sali, A.; Chalkley, R. J.; Cote, R. H.; Chu, F. J Mol Biol 2014.
-   (9) Liu, J.; Yu, C.; Hu, X.; Kim, J. K.; Bierma, J. C.; Jun, H. I.; Rychnovsky, S. D.; Huang, L.; Qiao, F. Cell Reports 2015, 12, 2169-2180.
-   (10) Yang, L.; Tang, X.; Weisbrod, C.; Munske, G.; Eng, J.; von Haller, P.; Kaiser, N.; Bruce, J. Analytical Chemistry 2010, 663-682.
-   (11) Kasper, P. T.; Back, J. W.; Vitale, M.; Hartog, A. F.; Roseboom, W.; de Koning, L. J.; van Maarseveen, J. H.; Muijsers, A. O.; de Koster, C. G.; de Jong, L. Chembiochem 2007, 8, 1281-1292.
-   (12) Lu, Y.; Tanasova, M.; Borhan, B.; Reid, G. E. Anal Chem 2008, 80, 9279-9287.
-   (13) Zhang, H.; Tang, X.; Munske, G. R.; Tolic, N.; Anderson, G. A.; Bruce, J. E. Mol Cell Proteomics 2009, 8, 409-420.
-   (14) Muller, M. Q.; Dreiocker, F.; Ihling, C. H.; Schafer, M.; Sinz, A. Anal Chem 2010, 82, 6958-6968.
-   (15) Petrotchenko, E. V.; Serpa, J. J.; Borchers, C. H. Mol Cell Proteomics 2011, 10, M110.001420.
-   (16) Kao, A.; Chiu, C. L.; Vellucci, D.; Yang, Y.; Patel, V. R.; Guan, S.; Randall, A.; Baldi, P.; Rychnovsky, S. D.; Huang, L. Mol Cell Proteomics 2011, 10, M110.002212.
-   (17) Yu, C.; Kandur, W.; Kao, A.; Rychnovsky, S.; Huang, L. Anal Chem 2014, 86, 2099-2106.
-   (18) Kaake, R. M.; Wang, X.; Burke, A.; Yu, C.; Kandur, W.; Yang, Y.; Novitsky, E. J.; Second, T.; Duan, J.; Kao, A.; Guan, S.; Vellucci, D.; Rychnovsky, S. D.; Huang, L. Mol Cell Proteomics 2014, 13, 3533-3543.
-   (19) Yu, C.; Mao, H.; Novitsky, E. J.; Tang, X.; Rychnovsky, S. D.; Zheng, N.; Huang, L. Nature Communications 2015, 6, 10053.
-   (20) Belsom, A.; Schneider, M.; Fischer, L.; Brock, O.; Rappsilber, J. Mol Cell Proteomics 2016, 15, 1105-1116.
-   (21) Apweiler, R., et al.; UniProt Consortium Nucleic Acids Res 2013, 41, D43-47.
-   (22) Leitner, A.; Joachimiak, L. A.; Unverdorben, P.; Walzthoeni, T.; Frydman, J.; Forster, F.; Aebersold, R. Proc Natl Acad Sci USA 2014, 111, 9455-9460.
-   (23) Novak, P.; Kruppa, G. H. Eur J Mass Spectrom (Chichester, Eng) 2008, 14, 355-365.
-   (24) Wang, X.; Chen, C. F.; Baker, P. R.; Chen, P. L.; Kaiser, P.; Huang, L. Biochemistry 2007, 46, 3553-3565.
-   (25) Leitner, A.; Walzthoeni, T.; Aebersold, R. Nat Protoc 2014, 9, 120-137.
-   (26) Schilling, B.; Row, R. H.; Gibson, B. W.; Guo, X.; Young, M. M. J Am Soc Mass Spectrom. 2003, 14, 834-850.
-   (27) Liu, F.; Rijkers, D. T.; Post, H.; Heck, A. J. Nat. Methods 2015, 12, 1179-1184.
-   (28) Craig B. Gutierrez, et al., Developing an Acidic Residue Reactive and Sulfoxide-Containing MS-Cleavable Homobifunctional Cross-Linker for Probing Protein-Protein Interactions, Anal. Chem.
2016, 88, 8315-8322. -   (29) Yu, C. et al, Anal. Chem, 2016. -   (30) Rozen, S. et al, Cell Reports, 2015. 

What is claimed is:
 1. An MS-cleavable cross-linker consisting of: two hydrazide reactive groups; a spacer arm with one central sulfoxide group, wherein the one central sulfoxide group is linked to each of the two hydrazide reactive groups through two methylene groups and a carbonyl group; and two symmetric collision-induced dissociation (CID) cleavable bonds on the spacer arm, wherein each of the two CID cleavable bonds is a C—S bond adjacent to the one central sulfoxide group, and wherein the MS-cleavable cross-linker is configured for mapping intra-protein interactions in a protein, or inter-protein interactions in a protein complex, or combinations thereof.
 2. The MS-cleavable cross-linker of claim 1, wherein each of the two hydrazide reactive groups is configured to react with an activated acidic side chain in the protein or protein complex.
 3. The MS-cleavable cross-linker of claim 1, wherein the MS-cleavable cross-linker is dihydrazide sulfoxide (DHSO), having the structure:


4. The MS-cleavable cross-linker of claim 3, wherein the two hydrazide reactive groups are separated by a 12.5 Å long spacer arm consisting of the two symmetrical CID cleavable C—S bonds flanking the one sulfoxide group.
 5. A method for synthesis of an MS-cleavable cross-linker according to claim 1, comprising the steps of: (i) providing disuccinimidyl sulfoxide (DSSO) in dichloromethane; (ii) adding tert-butyl carbazate to derive a first solution; (iii) stirring the first solution of step (ii); (iv) adding trifluoroacetic acid to derive a second solution; (v) stirring the second solution of step (iv); (vi) removing solvent from the second solution of step (v) in vacuo to derive an oil; (vii) dissolving the oil from step (vi) in methanol and adding trimethylamine to obtain a mixture; (viii) stirring the mixture from step (vii) to obtain a precipitate; and (ix) collecting the precipitate from step (viii), washing the precipitate in fresh methanol, and drying the precipitate in vacuo, thereby obtaining the MS-cleavable cross-linker according to claim 1.
 6. The method of claim 5, wherein stirring the first solution of step (ii) is performed at room temperature.
 7. The method of claim 6, wherein stirring the first solution of step (ii) is performed for about 6 h to about 24 h.
 8. The method of claim 5, wherein stirring the second solution of step (iv) is performed at room temperature.
 9. The method of claim 8, wherein stirring the second solution of step (iv) is performed for about 36 h to about 144 h.
 10. The method of claim 5, wherein stirring the mixture is performed at room temperature.
 11. The method of claim 10, wherein stirring the mixture is performed for about 10 min to about 40 min.
 12. The method of claim 5, wherein collecting the precipitate from step (viii) and washing the precipitate in fresh methanol are performed about 1 to about 10 times.
 13. The method of claim 5, wherein the MS-cleavable cross-linker is DHSO, having the structure:


14. A method for mapping intra-protein interactions in a protein, inter-protein interactions in a protein complex, or combinations thereof, the method comprising: providing 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM) as an activating agent; activating at least one acidic residue by DMTMM in the protein or protein complex; providing an MS-cleavable cross-linker according to claim 1; cross-linking the MS-cleavable cross-linker to the at least one activated acidic residue in the protein or protein complex; digesting with an enzyme the protein or protein complex cross-linked to the MS-cleavable cross-linker; generating one or more peptide fragments of the protein or protein complex, wherein the one or more peptide fragments are chemically cross-linked to the MS-cleavable cross-linker; and identifying the one or more peptide fragments using tandem mass spectrometry (MS^(n)), thereby mapping intra-protein interactions in the protein and/or inter-protein interactions in the protein complex.
 15. The method of claim 14, wherein the MS-cleavable cross-linker is DHSO, having the structure:


16. A method for cross-linking mass spectrometry (XL-MS) for identifying one or more cross-linked peptides, the method comprising: providing 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMTMM) as an activating agent; activating at least one acidic residue by DMTMM in a protein or a protein complex; providing an MS-cleavable cross-linker according to claim 1; cross-linking the MS-cleavable cross-linker to the at least one activated acidic residue in the protein or protein complex; digesting with an enzyme the protein or protein complex cross-linked to the MS-cleavable cross-linker; generating one or more peptide fragments of the protein or protein complex, wherein the one or more peptide fragments are chemically cross-linked to the MS-cleavable cross-linker; performing a liquid chromatography-tandem mass spectrometry (LC-MS^(n)) analysis on the one or more cross-linked peptides, wherein the LC-MS^(n) analysis comprises: detecting the one or more cross-linked peptides by MS¹ analysis; selecting the one or more cross-linked peptides detected by MS¹ for MS² analysis; selectively fragmenting the at least one CID cleavable bond and separating the one or more cross-linked peptides during MS² analysis; sequencing the one or more cross-linked peptides separated during MS² analysis by MS³ analysis; and integrating data obtained during MS¹, MS² and MS³ analyses to identify the one or more cross-linked peptides.
 17. The method of claim 16, wherein the MS-cleavable cross-linker is DHSO, having the structure: